date:20241023

[Regression] wifi problems since tg3 started throwing rcu stall warnings

2024-10-23 Thread Linux regression tracking (Thorsten Leemhuis)

Hi, Thorsten here, the Linux kernel's regression tracker.

Frederic, I noticed a report about a regression in bugzilla.kernel.org
that appears to be caused by the following change of yours:

55d4669ef1b768 ("rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU
invocation")

As many (most?) kernel developers don't keep an eye on the bug tracker,
I decided to write this mail. To quote from
https://bugzilla.kernel.org/show_bug.cgi?id=219390:

>  Mingcong Bai 2024-10-15 13:32:35 UTC
> 
> Since aa162aa4aa383a0a714b1c36e8fcc77612ddd1a2 between v6.10.4 and
> v6.10.5, the Broadcom Tigon3 Ethernet interface (tg3) found on Apple
> MacBook Pro (15'', Mid 2010) would throw many rcu stall errors during
> boot up, causing peripherals such as the wireless card to misbehave:
> 
> [   24.153855] rcu: INFO: rcu_preempt detected expedited stalls on 
> CPUs/tasks: { 2- } 21 jiffies s: 973 root: 0x4/.
> [   24.166938] rcu: blocking rcu_node structures (internal RCU debug):
> [   24.177800] Sending NMI from CPU 3 to CPUs 2:
> [   24.183113] NMI backtrace for cpu 2
> [   24.183119] CPU: 2 PID: 1049 Comm: NetworkManager Not tainted 
> 6.10.5-aosc-main #1
> [   24.183123] Hardware name: Apple Inc. MacBookPro6,2/Mac-F22586C8, BIOS
> MBP61.88Z.005D.B00.1804100943 04/10/18
> [   24.183125] RIP: 0010:__this_module+0x2d3d1/0x4f310 [tg3]
> [   24.183135] Code: c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 
> 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 89 f6 48 03 77 30 8b 06 <31> f6 
> 31 ff c3 cc cc cc cc 66 0f 1f 44 00 00 90 90 90 90 90 90 90
> [   24.183138] RSP: 0018:bf1a011d75e8 EFLAGS: 0082
> [   24.183141] RAX:  RBX: a04ec78f8a00 RCX: 
> 
> [   24.183143] RDX:  RSI: bf1a00fb007c RDI: 
> a04ec78f8a00
> [   24.183145] RBP: 0b50 R08:  R09: 
> 
> [   24.183147] R10:  R11:  R12: 
> 0216
> [   24.183148] R13: bf1a011d7624 R14: a04ec78f8a08 R15: 
> a04ec78f8b40
> [   24.183151] FS:  7f4c524b2140() GS:a05007d0() 
> knlGS:
> [   24.183153] CS:  0010 DS:  ES:  CR0: 80050033
> [   24.183155] CR2: 7f7025eae3e8 CR3: 0001040f8000 CR4: 
> 06f0
> [   24.183157] Call Trace:
> [   24.183162]  
> [   24.183167]  ? nmi_cpu_backtrace+0xbf/0x140
> [   24.183175]  ? nmi_cpu_backtrace_handler+0x11/0x20
> [   24.183181]  ? nmi_handle+0x61/0x160
> [   24.183186]  ? default_do_nmi+0x42/0x110
> [   24.183191]  ? exc_nmi+0x1bd/0x290
> [   24.183194]  ? end_repeat_nmi+0xf/0x53
> [   24.183203]  ? __this_module+0x2d3d1/0x4f310 [tg3]
> [   24.183207]  ? __this_module+0x2d3d1/0x4f310 [tg3]
> [   24.183210]  ? __this_module+0x2d3d1/0x4f310 [tg3]
> [   24.183213]  
> [   24.183214]  
> [   24.183215]  __this_module+0x31828/0x4f310 [tg3]
> [   24.183218]  ? __this_module+0x2d390/0x4f310 [tg3]
> [   24.183221]  __this_module+0x398e6/0x4f310 [tg3]
> [   24.183225]  __this_module+0x3baf8/0x4f310 [tg3]
> [   24.183229]  __this_module+0x4733f/0x4f310 [tg3]
> [   24.183233]  ? _raw_spin_unlock_irqrestore+0x25/0x70
> [   24.183237]  ? __this_module+0x398e6/0x4f310 [tg3]
> [   24.183241]  __this_module+0x4b943/0x4f310 [tg3]
> [   24.183244]  ? delay_tsc+0x89/0xf0
> [   24.183249]  ? preempt_count_sub+0x51/0x60
> [   24.183254]  __this_module+0x4be4b/0x4f310 [tg3]
> [   24.183258]  __dev_open+0x103/0x1c0
> [   24.183265]  __dev_change_flags+0x1bd/0x230
> [   24.183269]  ? rtnl_getlink+0x362/0x400
> [   24.183276]  dev_change_flags+0x26/0x70
> [   24.183280]  do_setlink+0xe16/0x11f0
> [   24.183286]  ? __nla_validate_parse+0x61/0xd40
> [   24.183295]  __rtnl_newlink+0x63d/0x9f0
> [   24.183301]  ? kmem_cache_alloc_node_noprof+0x12b/0x360
> [   24.183308]  ? kmalloc_trace_noprof+0x11e/0x350
> [   24.183312]  ? rtnl_newlink+0x2e/0x70
> [   24.183316]  rtnl_newlink+0x47/0x70
> [   24.183320]  rtnetlink_rcv_msg+0x152/0x400
> [   24.183324]  ? __netlink_sendskb+0x68/0x90
> [   24.183329]  ? netlink_unicast+0x237/0x290
> [   24.18]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> [   24.183336]  netlink_rcv_skb+0x5b/0x110
> [   24.183343]  netlink_unicast+0x1a4/0x290
> [   24.183347]  netlink_sendmsg+0x222/0x4a0
> [   24.183350]  ? proc_get_long.constprop.0+0x116/0x210
> [   24.183358]  sys_sendmsg+0x379/0x3b0
> [   24.183363]  ? copy_msghdr_from_user+0x6d/0xb0
> [   24.183368]  ___sys_sendmsg+0x86/0xe0
> [   24.183372]  ? addrconf_sysctl_forward+0xf3/0x270
> [   24.183378]  ? _copy_from_iter+0x8b/0x570
> [   24.183384]  ? __pfx_addrconf_sysctl_forward+0x10/0x10
> [   24.183388]  ? _raw_spin_unlock+0x19/0x50
> [   24.183392]  ? proc_sys_call_handler+0xf3/0x2f0
> [   24.183397]  ? trace_hardirqs_on+0x29/0x90
> [   24.183401]  ? __fdget+0xc2/0xf0
> [   24.183405]  __sys_sendmsg+0x5b/0xc0
> [   24.183410]  ? syscall_trace_enter+0x110/0x1b0
> [   24.183416]  do_syscall_64+0x64/0x150
> [   24.183423]  entry_SYSCALL_64_after_hwframe+0x7

Re: [Regression] wifi problems since tg3 started throwing rcu stall warnings

2024-10-23 Thread Linux regression tracking (Thorsten Leemhuis)

[reply to self to CC the tg3 maintainers, netdev, and linux-wireless;
sorry, forgot them earlier, but they should be involved, as I guess this
is a problem in tg3 interfering with wifi drivers that the rcu change
just exposed]

On 23.10.24 10:27, Linux regression tracking (Thorsten Leemhuis) wrote:
> Hi, Thorsten here, the Linux kernel's regression tracker.
> 
> Frederic, I noticed a report about a regression in bugzilla.kernel.org
> that appears to be caused by the following change of yours:
> 
> 55d4669ef1b768 ("rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU
> invocation")
> 
> As many (most?) kernel developers don't keep an eye on the bug tracker,
> I decided to write this mail. To quote from
> https://bugzilla.kernel.org/show_bug.cgi?id=219390:
> 
>>  Mingcong Bai 2024-10-15 13:32:35 UTC
>>
>> Since aa162aa4aa383a0a714b1c36e8fcc77612ddd1a2 between v6.10.4 and
>> v6.10.5, the Broadcom Tigon3 Ethernet interface (tg3) found on Apple
>> MacBook Pro (15'', Mid 2010) would throw many rcu stall errors during
>> boot up, causing peripherals such as the wireless card to misbehave:
>>
>> [   24.153855] rcu: INFO: rcu_preempt detected expedited stalls on 
>> CPUs/tasks: { 2- } 21 jiffies s: 973 root: 0x4/.
>> [   24.166938] rcu: blocking rcu_node structures (internal RCU debug):
>> [   24.177800] Sending NMI from CPU 3 to CPUs 2:
>> [   24.183113] NMI backtrace for cpu 2
>> [   24.183119] CPU: 2 PID: 1049 Comm: NetworkManager Not tainted 
>> 6.10.5-aosc-main #1
>> [   24.183123] Hardware name: Apple Inc. MacBookPro6,2/Mac-F22586C8, BIOS
>> MBP61.88Z.005D.B00.1804100943 04/10/18
>> [   24.183125] RIP: 0010:__this_module+0x2d3d1/0x4f310 [tg3]
>> [   24.183135] Code: c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 
>> 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 89 f6 48 03 77 30 8b 06 <31> 
>> f6 31 ff c3 cc cc cc cc 66 0f 1f 44 00 00 90 90 90 90 90 90 90
>> [   24.183138] RSP: 0018:bf1a011d75e8 EFLAGS: 0082
>> [   24.183141] RAX:  RBX: a04ec78f8a00 RCX: 
>> 
>> [   24.183143] RDX:  RSI: bf1a00fb007c RDI: 
>> a04ec78f8a00
>> [   24.183145] RBP: 0b50 R08:  R09: 
>> 
>> [   24.183147] R10:  R11:  R12: 
>> 0216
>> [   24.183148] R13: bf1a011d7624 R14: a04ec78f8a08 R15: 
>> a04ec78f8b40
>> [   24.183151] FS:  7f4c524b2140() GS:a05007d0() 
>> knlGS:
>> [   24.183153] CS:  0010 DS:  ES:  CR0: 80050033
>> [   24.183155] CR2: 7f7025eae3e8 CR3: 0001040f8000 CR4: 
>> 06f0
>> [   24.183157] Call Trace:
>> [   24.183162]  
>> [   24.183167]  ? nmi_cpu_backtrace+0xbf/0x140
>> [   24.183175]  ? nmi_cpu_backtrace_handler+0x11/0x20
>> [   24.183181]  ? nmi_handle+0x61/0x160
>> [   24.183186]  ? default_do_nmi+0x42/0x110
>> [   24.183191]  ? exc_nmi+0x1bd/0x290
>> [   24.183194]  ? end_repeat_nmi+0xf/0x53
>> [   24.183203]  ? __this_module+0x2d3d1/0x4f310 [tg3]
>> [   24.183207]  ? __this_module+0x2d3d1/0x4f310 [tg3]
>> [   24.183210]  ? __this_module+0x2d3d1/0x4f310 [tg3]
>> [   24.183213]  
>> [   24.183214]  
>> [   24.183215]  __this_module+0x31828/0x4f310 [tg3]
>> [   24.183218]  ? __this_module+0x2d390/0x4f310 [tg3]
>> [   24.183221]  __this_module+0x398e6/0x4f310 [tg3]
>> [   24.183225]  __this_module+0x3baf8/0x4f310 [tg3]
>> [   24.183229]  __this_module+0x4733f/0x4f310 [tg3]
>> [   24.183233]  ? _raw_spin_unlock_irqrestore+0x25/0x70
>> [   24.183237]  ? __this_module+0x398e6/0x4f310 [tg3]
>> [   24.183241]  __this_module+0x4b943/0x4f310 [tg3]
>> [   24.183244]  ? delay_tsc+0x89/0xf0
>> [   24.183249]  ? preempt_count_sub+0x51/0x60
>> [   24.183254]  __this_module+0x4be4b/0x4f310 [tg3]
>> [   24.183258]  __dev_open+0x103/0x1c0
>> [   24.183265]  __dev_change_flags+0x1bd/0x230
>> [   24.183269]  ? rtnl_getlink+0x362/0x400
>> [   24.183276]  dev_change_flags+0x26/0x70
>> [   24.183280]  do_setlink+0xe16/0x11f0
>> [   24.183286]  ? __nla_validate_parse+0x61/0xd40
>> [   24.183295]  __rtnl_newlink+0x63d/0x9f0
>> [   24.183301]  ? kmem_cache_alloc_node_noprof+0x12b/0x360
>> [   24.183308]  ? kmalloc_trace_noprof+0x11e/0x350
>> [   24.183312]  ? rtnl_newlink+0x2e/0x70
>> [   24.183316]  rtnl_newlink+0x47/0x70
>> [   24.183320]  rtnetlink_rcv_msg+0x152/0x400
>> [   24.183324]  ? __netlink_sendskb+0x68/0x90
>> [   24.183329]  ? netlink_unicast+0x237/0x290
>> [   24.18]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
>> [   24.183336]  netlink_rcv_skb+0x5b/0x110
>> [   24.183343]  netlink_unicast+0x1a4/0x290
>> [   24.183347]  netlink_sendmsg+0x222/0x4a0
>> [   24.183350]  ? proc_get_long.constprop.0+0x116/0x210
>> [   24.183358]  sys_sendmsg+0x379/0x3b0
>> [   24.183363]  ? copy_msghdr_from_user+0x6d/0xb0
>> [   24.183368]  ___sys_sendmsg+0x86/0xe0
>> [   24.183372]  ? addrconf_sysctl_forward+0xf3/0x270
>> [   24.183378]  ? _copy_from_iter+0x8b/0x570
>> [   24.183384]  ? __pfx_ad

Re: [Regression] wifi problems since tg3 started throwing rcu stall warnings

2024-10-23 Thread Frederic Weisbecker

Hi Thorsten,

First, thanks for letting us know.

Le Wed, Oct 23, 2024 at 10:27:18AM +0200, Linux regression tracking (Thorsten 
Leemhuis) a écrit :
> Hi, Thorsten here, the Linux kernel's regression tracker.
> 
> Frederic, I noticed a report about a regression in bugzilla.kernel.org
> that appears to be caused by the following change of yours:
> 
> 55d4669ef1b768 ("rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU
> invocation")

Are you sure about the commit? Below it says:

> 
> As many (most?) kernel developers don't keep an eye on the bug tracker,
> I decided to write this mail. To quote from
> https://bugzilla.kernel.org/show_bug.cgi?id=219390:
> 
> >  Mingcong Bai 2024-10-15 13:32:35 UTC
> > 
> > Since aa162aa4aa383a0a714b1c36e8fcc77612ddd1a2 between v6.10.4 and

Now that's aa162aa4aa383a0a714b1c36e8fcc77612ddd1a2 which I can't find in 
vanilla
tree.

Also I'm failing to see an immediate issue between the below stacktrace
and the above commit. So are we sure about that reference?

Thanks.


> > v6.10.5, the Broadcom Tigon3 Ethernet interface (tg3) found on Apple
> > MacBook Pro (15'', Mid 2010) would throw many rcu stall errors during
> > boot up, causing peripherals such as the wireless card to misbehave:
> > 
> > [   24.153855] rcu: INFO: rcu_preempt detected expedited stalls on 
> > CPUs/tasks: { 2- } 21 jiffies s: 973 root: 0x4/.
> > [   24.166938] rcu: blocking rcu_node structures (internal RCU debug):
> > [   24.177800] Sending NMI from CPU 3 to CPUs 2:
> > [   24.183113] NMI backtrace for cpu 2
> > [   24.183119] CPU: 2 PID: 1049 Comm: NetworkManager Not tainted 
> > 6.10.5-aosc-main #1
> > [   24.183123] Hardware name: Apple Inc. MacBookPro6,2/Mac-F22586C8, BIOS   
> >  MBP61.88Z.005D.B00.1804100943 04/10/18
> > [   24.183125] RIP: 0010:__this_module+0x2d3d1/0x4f310 [tg3]
> > [   24.183135] Code: c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 
> > 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 89 f6 48 03 77 30 8b 06 
> > <31> f6 31 ff c3 cc cc cc cc 66 0f 1f 44 00 00 90 90 90 90 90 90 90
> > [   24.183138] RSP: 0018:bf1a011d75e8 EFLAGS: 0082
> > [   24.183141] RAX:  RBX: a04ec78f8a00 RCX: 
> > 
> > [   24.183143] RDX:  RSI: bf1a00fb007c RDI: 
> > a04ec78f8a00
> > [   24.183145] RBP: 0b50 R08:  R09: 
> > 
> > [   24.183147] R10:  R11:  R12: 
> > 0216
> > [   24.183148] R13: bf1a011d7624 R14: a04ec78f8a08 R15: 
> > a04ec78f8b40
> > [   24.183151] FS:  7f4c524b2140() GS:a05007d0() 
> > knlGS:
> > [   24.183153] CS:  0010 DS:  ES:  CR0: 80050033
> > [   24.183155] CR2: 7f7025eae3e8 CR3: 0001040f8000 CR4: 
> > 06f0
> > [   24.183157] Call Trace:
> > [   24.183162]  
> > [   24.183167]  ? nmi_cpu_backtrace+0xbf/0x140
> > [   24.183175]  ? nmi_cpu_backtrace_handler+0x11/0x20
> > [   24.183181]  ? nmi_handle+0x61/0x160
> > [   24.183186]  ? default_do_nmi+0x42/0x110
> > [   24.183191]  ? exc_nmi+0x1bd/0x290
> > [   24.183194]  ? end_repeat_nmi+0xf/0x53
> > [   24.183203]  ? __this_module+0x2d3d1/0x4f310 [tg3]
> > [   24.183207]  ? __this_module+0x2d3d1/0x4f310 [tg3]
> > [   24.183210]  ? __this_module+0x2d3d1/0x4f310 [tg3]
> > [   24.183213]  
> > [   24.183214]  
> > [   24.183215]  __this_module+0x31828/0x4f310 [tg3]
> > [   24.183218]  ? __this_module+0x2d390/0x4f310 [tg3]
> > [   24.183221]  __this_module+0x398e6/0x4f310 [tg3]
> > [   24.183225]  __this_module+0x3baf8/0x4f310 [tg3]
> > [   24.183229]  __this_module+0x4733f/0x4f310 [tg3]
> > [   24.183233]  ? _raw_spin_unlock_irqrestore+0x25/0x70
> > [   24.183237]  ? __this_module+0x398e6/0x4f310 [tg3]
> > [   24.183241]  __this_module+0x4b943/0x4f310 [tg3]
> > [   24.183244]  ? delay_tsc+0x89/0xf0
> > [   24.183249]  ? preempt_count_sub+0x51/0x60
> > [   24.183254]  __this_module+0x4be4b/0x4f310 [tg3]
> > [   24.183258]  __dev_open+0x103/0x1c0
> > [   24.183265]  __dev_change_flags+0x1bd/0x230
> > [   24.183269]  ? rtnl_getlink+0x362/0x400
> > [   24.183276]  dev_change_flags+0x26/0x70
> > [   24.183280]  do_setlink+0xe16/0x11f0
> > [   24.183286]  ? __nla_validate_parse+0x61/0xd40
> > [   24.183295]  __rtnl_newlink+0x63d/0x9f0
> > [   24.183301]  ? kmem_cache_alloc_node_noprof+0x12b/0x360
> > [   24.183308]  ? kmalloc_trace_noprof+0x11e/0x350
> > [   24.183312]  ? rtnl_newlink+0x2e/0x70
> > [   24.183316]  rtnl_newlink+0x47/0x70
> > [   24.183320]  rtnetlink_rcv_msg+0x152/0x400
> > [   24.183324]  ? __netlink_sendskb+0x68/0x90
> > [   24.183329]  ? netlink_unicast+0x237/0x290
> > [   24.18]  ? __pfx_rtnetlink_rcv_msg+0x10/0x10
> > [   24.183336]  netlink_rcv_skb+0x5b/0x110
> > [   24.183343]  netlink_unicast+0x1a4/0x290
> > [   24.183347]  netlink_sendmsg+0x222/0x4a0
> > [   24.183350]  ? proc_get_long.constprop.0+0x116/0x210
> > [   24.183358]  sys_sendmsg+0x379/0x3b0
> > [   24.183363]

Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.

2024-10-23 Thread Frederic Weisbecker

Le Wed, Oct 23, 2024 at 08:30:14AM +0200, Sebastian Andrzej Siewior a écrit :
> On 2024-10-23 00:27:34 [+0200], Frederic Weisbecker wrote:
> > > Try again without the "ksoftirqd will collect it all" since this won't
> > > happen since the revert I mentioned.
> > 
> > I still don't get it, this makes:
> > 
> > """
> > Once the ksoftirqd is marked as pending (or is running), a softirq which
> > would have been processed at the end of the threaded interrupt, which runs
> > at an elevated priority, is now moved to ksoftirqd which runs at SCHED_OTHER
> > priority and competes with every regular task for CPU resources.
> > """
> > 
> > ksoftirqd raised for timers still doesn't prevent a threaded IRQ from 
> > running
> > softirqs, unless it preempts ksoftirqd and waits with PI. So is it what 
> > you're
> > trying to solve?
> > 
> > Or is the problem that timer softirqs are executed with SCHED_NORMAL?
> 
> Exactly. It runs at SCHED_NORMAL and competes with everything else. It
> can delay tasks wakes depending on system load.

Ok so that narrows down the problem and it's much clearer, thanks.

> > > > > +void raise_timer_softirq(void)
> > > > > +{
> > > > > + unsigned long flags;
> > > > > +
> > > > > + local_irq_save(flags);
> > > > > + raise_ktimers_thread(TIMER_SOFTIRQ);
> > > > > + wake_timersd();
> > > > 
> > > > This is supposed to be called from hardirq only, right?
> > > > Can't irq_exit_rcu() take care of it? Why is it different
> > > > from HRTIMER_SOFTIRQ ?
> > > 
> > > Good question. This shouldn't be any different compared to the hrtimer
> > > case. This is only raised in hardirq, so yes, the irq_save can go away
> > > and the wake call, too.
> > 
> > Cool. You can add lockdep_assert_in_irq() within raise_ktimers_thread() for
> > some well deserved relief :-)
> 
> If you want to, sure. I would add them to both raise functions.

Yeah, just in case. Thanks!

> 
> > Thanks.
> 
> Sebastian

Re: [Regression] wifi problems since tg3 started throwing rcu stall warnings

2024-10-23 Thread Linux regression tracking (Thorsten Leemhuis)

On 23.10.24 12:09, Frederic Weisbecker wrote:
> Le Wed, Oct 23, 2024 at 10:27:18AM +0200, Linux regression tracking (Thorsten 
> Leemhuis) a écrit :
>> Hi, Thorsten here, the Linux kernel's regression tracker.
>>
>> Frederic, I noticed a report about a regression in bugzilla.kernel.org
>> that appears to be caused by the following change of yours:
>>
>> 55d4669ef1b768 ("rcu: Fix rcu_barrier() VS post CPUHP_TEARDOWN_CPU
>> invocation")

> Are you sure about the commit? Below it says:

Not totally, but...

>> As many (most?) kernel developers don't keep an eye on the bug tracker,
>> I decided to write this mail. To quote from
>> https://bugzilla.kernel.org/show_bug.cgi?id=219390:
>>
>>>  Mingcong Bai 2024-10-15 13:32:35 UTC
>>>
>>> Since aa162aa4aa383a0a714b1c36e8fcc77612ddd1a2 between v6.10.4 and
> 
> Now that's aa162aa4aa383a0a714b1c36e8fcc77612ddd1a2 which I can't find in 
> vanilla
> tree.

...quite, as that is the commit-id of the backport to v6.10.5; and the
reporter reverted it there. Ideally of course that would have happened
on recent mainline. If you doubt, ask Mingcong Bai to check if a revert
there helps, too.

HTH, Ciao, Thorsten

> Also I'm failing to see an immediate issue between the below stacktrace
> and the above commit. So are we sure about that reference?
> 
> Thanks.
> 
> 
>>> v6.10.5, the Broadcom Tigon3 Ethernet interface (tg3) found on Apple
>>> MacBook Pro (15'', Mid 2010) would throw many rcu stall errors during
>>> boot up, causing peripherals such as the wireless card to misbehave:
>>>
>>> [   24.153855] rcu: INFO: rcu_preempt detected expedited stalls on 
>>> CPUs/tasks: { 2- } 21 jiffies s: 973 root: 0x4/.
>>> [   24.166938] rcu: blocking rcu_node structures (internal RCU debug):
>>> [   24.177800] Sending NMI from CPU 3 to CPUs 2:
>>> [   24.183113] NMI backtrace for cpu 2
>>> [   24.183119] CPU: 2 PID: 1049 Comm: NetworkManager Not tainted 
>>> 6.10.5-aosc-main #1
>>> [   24.183123] Hardware name: Apple Inc. MacBookPro6,2/Mac-F22586C8, BIOS   
>>>  MBP61.88Z.005D.B00.1804100943 04/10/18
>>> [   24.183125] RIP: 0010:__this_module+0x2d3d1/0x4f310 [tg3]
>>> [   24.183135] Code: c3 cc cc cc cc 0f 1f 40 00 90 90 90 90 90 90 90 90 90 
>>> 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 89 f6 48 03 77 30 8b 06 
>>> <31> f6 31 ff c3 cc cc cc cc 66 0f 1f 44 00 00 90 90 90 90 90 90 90
>>> [   24.183138] RSP: 0018:bf1a011d75e8 EFLAGS: 0082
>>> [   24.183141] RAX:  RBX: a04ec78f8a00 RCX: 
>>> 
>>> [   24.183143] RDX:  RSI: bf1a00fb007c RDI: 
>>> a04ec78f8a00
>>> [   24.183145] RBP: 0b50 R08:  R09: 
>>> 
>>> [   24.183147] R10:  R11:  R12: 
>>> 0216
>>> [   24.183148] R13: bf1a011d7624 R14: a04ec78f8a08 R15: 
>>> a04ec78f8b40
>>> [   24.183151] FS:  7f4c524b2140() GS:a05007d0() 
>>> knlGS:
>>> [   24.183153] CS:  0010 DS:  ES:  CR0: 80050033
>>> [   24.183155] CR2: 7f7025eae3e8 CR3: 0001040f8000 CR4: 
>>> 06f0
>>> [   24.183157] Call Trace:
>>> [   24.183162]  
>>> [   24.183167]  ? nmi_cpu_backtrace+0xbf/0x140
>>> [   24.183175]  ? nmi_cpu_backtrace_handler+0x11/0x20
>>> [   24.183181]  ? nmi_handle+0x61/0x160
>>> [   24.183186]  ? default_do_nmi+0x42/0x110
>>> [   24.183191]  ? exc_nmi+0x1bd/0x290
>>> [   24.183194]  ? end_repeat_nmi+0xf/0x53
>>> [   24.183203]  ? __this_module+0x2d3d1/0x4f310 [tg3]
>>> [   24.183207]  ? __this_module+0x2d3d1/0x4f310 [tg3]
>>> [   24.183210]  ? __this_module+0x2d3d1/0x4f310 [tg3]
>>> [   24.183213]  
>>> [   24.183214]  
>>> [   24.183215]  __this_module+0x31828/0x4f310 [tg3]
>>> [   24.183218]  ? __this_module+0x2d390/0x4f310 [tg3]
>>> [   24.183221]  __this_module+0x398e6/0x4f310 [tg3]
>>> [   24.183225]  __this_module+0x3baf8/0x4f310 [tg3]
>>> [   24.183229]  __this_module+0x4733f/0x4f310 [tg3]
>>> [   24.183233]  ? _raw_spin_unlock_irqrestore+0x25/0x70
>>> [   24.183237]  ? __this_module+0x398e6/0x4f310 [tg3]
>>> [   24.183241]  __this_module+0x4b943/0x4f310 [tg3]
>>> [   24.183244]  ? delay_tsc+0x89/0xf0
>>> [   24.183249]  ? preempt_count_sub+0x51/0x60
>>> [   24.183254]  __this_module+0x4be4b/0x4f310 [tg3]
>>> [   24.183258]  __dev_open+0x103/0x1c0
>>> [   24.183265]  __dev_change_flags+0x1bd/0x230
>>> [   24.183269]  ? rtnl_getlink+0x362/0x400
>>> [   24.183276]  dev_change_flags+0x26/0x70
>>> [   24.183280]  do_setlink+0xe16/0x11f0
>>> [   24.183286]  ? __nla_validate_parse+0x61/0xd40
>>> [   24.183295]  __rtnl_newlink+0x63d/0x9f0
>>> [   24.183301]  ? kmem_cache_alloc_node_noprof+0x12b/0x360
>>> [   24.183308]  ? kmalloc_trace_noprof+0x11e/0x350
>>> [   24.183312]  ? rtnl_newlink+0x2e/0x70
>>> [   24.183316]  rtnl_newlink+0x47/0x70
>>> [   24.183320]  rtnetlink_rcv_msg+0x152/0x400
>>> [   24.183324]  ? __netlink_sendskb+0x68/0x90
>>> [   24.183329]  ? netlink_unicast+0x237/0x290
>>> [   24.18]  ? __p

[PATCH] sched_ext: Fix function pointer type mismatches in BPF selftests

2024-10-23 Thread Vishal Chourasia

Fix incompatible function pointer type warnings in sched_ext BPF selftests by
explicitly casting the function pointers when initializing struct_ops.
This addresses multiple -Wincompatible-function-pointer-types warnings from the
clang compiler where function signatures didn't match exactly.

The void * cast ensures the compiler accepts the function pointer
assignment despite minor type differences in the parameters.
---
 .../selftests/sched_ext/create_dsq.bpf.c  |  6 +-
 .../sched_ext/ddsp_bogus_dsq_fail.bpf.c   |  4 +-
 .../sched_ext/ddsp_vtimelocal_fail.bpf.c  |  4 +-
 .../selftests/sched_ext/dsp_local_on.bpf.c|  8 +--
 .../sched_ext/enq_select_cpu_fails.bpf.c  |  4 +-
 tools/testing/selftests/sched_ext/exit.bpf.c  | 14 ++---
 .../testing/selftests/sched_ext/hotplug.bpf.c |  8 +--
 .../sched_ext/init_enable_count.bpf.c |  8 +--
 .../testing/selftests/sched_ext/maximal.bpf.c | 58 +--
 .../selftests/sched_ext/maybe_null.bpf.c  |  6 +-
 .../sched_ext/maybe_null_fail_dsp.bpf.c   |  4 +-
 .../sched_ext/maybe_null_fail_yld.bpf.c   |  4 +-
 .../selftests/sched_ext/prog_run.bpf.c|  2 +-
 .../selftests/sched_ext/select_cpu_dfl.bpf.c  |  2 +-
 .../sched_ext/select_cpu_dfl_nodispatch.bpf.c |  6 +-
 .../sched_ext/select_cpu_dispatch.bpf.c   |  2 +-
 .../select_cpu_dispatch_bad_dsq.bpf.c |  4 +-
 .../select_cpu_dispatch_dbl_dsp.bpf.c |  4 +-
 .../sched_ext/select_cpu_vtime.bpf.c  | 12 ++--
 19 files changed, 80 insertions(+), 80 deletions(-)

diff --git a/tools/testing/selftests/sched_ext/create_dsq.bpf.c 
b/tools/testing/selftests/sched_ext/create_dsq.bpf.c
index 23f79ed343f02..2cfc4ffd60e28 100644
--- a/tools/testing/selftests/sched_ext/create_dsq.bpf.c
+++ b/tools/testing/selftests/sched_ext/create_dsq.bpf.c
@@ -51,8 +51,8 @@ s32 BPF_STRUCT_OPS_SLEEPABLE(create_dsq_init)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops create_dsq_ops = {
-   .init_task  = create_dsq_init_task,
-   .exit_task  = create_dsq_exit_task,
-   .init   = create_dsq_init,
+   .init_task  = (void *) create_dsq_init_task,
+   .exit_task  = (void *) create_dsq_exit_task,
+   .init   = (void *) create_dsq_init,
.name   = "create_dsq",
 };
diff --git a/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c 
b/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c
index e97ad41d354ad..37d9bf6fb7458 100644
--- a/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c
+++ b/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c
@@ -35,8 +35,8 @@ void BPF_STRUCT_OPS(ddsp_bogus_dsq_fail_exit, struct 
scx_exit_info *ei)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops ddsp_bogus_dsq_fail_ops = {
-   .select_cpu = ddsp_bogus_dsq_fail_select_cpu,
-   .exit   = ddsp_bogus_dsq_fail_exit,
+   .select_cpu = (void *) ddsp_bogus_dsq_fail_select_cpu,
+   .exit   = (void *) ddsp_bogus_dsq_fail_exit,
.name   = "ddsp_bogus_dsq_fail",
.timeout_ms = 1000U,
 };
diff --git a/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c 
b/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c
index dde7e7dafbfbc..dffc97d9cdf14 100644
--- a/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c
+++ b/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c
@@ -32,8 +32,8 @@ void BPF_STRUCT_OPS(ddsp_vtimelocal_fail_exit, struct 
scx_exit_info *ei)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops ddsp_vtimelocal_fail_ops = {
-   .select_cpu = ddsp_vtimelocal_fail_select_cpu,
-   .exit   = ddsp_vtimelocal_fail_exit,
+   .select_cpu = (void *) ddsp_vtimelocal_fail_select_cpu,
+   .exit   = (void *) ddsp_vtimelocal_fail_exit,
.name   = "ddsp_vtimelocal_fail",
.timeout_ms = 1000U,
 };
diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c 
b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
index efb4672decb41..6a7db1502c29e 100644
--- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
+++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
@@ -56,10 +56,10 @@ void BPF_STRUCT_OPS(dsp_local_on_exit, struct scx_exit_info 
*ei)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops dsp_local_on_ops = {
-   .select_cpu = dsp_local_on_select_cpu,
-   .enqueue= dsp_local_on_enqueue,
-   .dispatch   = dsp_local_on_dispatch,
-   .exit   = dsp_local_on_exit,
+   .select_cpu = (void *) dsp_local_on_select_cpu,
+   .enqueue= (void *) dsp_local_on_enqueue,
+   .dispatch   = (void *) dsp_local_on_dispatch,
+   .exit   = (void *) dsp_local_on_exit,

Re: [PATCH 1/1] softirq: Use a dedicated thread for timer wakeups on PREEMPT_RT.

2024-10-23 Thread Sebastian Andrzej Siewior

On 2024-10-23 08:30:18 [+0200], To Frederic Weisbecker wrote:
> > > > > +void raise_timer_softirq(void)
> > > > > +{
> > > > > + unsigned long flags;
> > > > > +
> > > > > + local_irq_save(flags);
> > > > > + raise_ktimers_thread(TIMER_SOFTIRQ);
> > > > > + wake_timersd();
> > > > 
> > > > This is supposed to be called from hardirq only, right?
> > > > Can't irq_exit_rcu() take care of it? Why is it different
> > > > from HRTIMER_SOFTIRQ ?
> > > 
> > > Good question. This shouldn't be any different compared to the hrtimer
> > > case. This is only raised in hardirq, so yes, the irq_save can go away
> > > and the wake call, too.
> > 
> > Cool. You can add lockdep_assert_in_irq() within raise_ktimers_thread() for
> > some well deserved relief :-)
> 
> If you want to, sure. I would add them to both raise functions.

That function (run_local_timers()) was in past also called from other
places like the APIC IRQ but all this is gone now. The reason why I
added the wake and the local_irq_save() is because it uses
raise_softirq() instead raise_softirq_irqoff(). And raise_softirq() was
used since it was separated away from tasklets.

Now, raise_timer_softirq() is a function within softirq.c because it
needs to access task_struct timersd which was only accessible there. It
has been made public later due to the rcutorture bits so it could be
very much be made inline and reduced to just raise_ktimers_thread().
I tend to make TIMER_SOFTIRQ use also raise_softirq_irqoff() to make it
look the same. That lockdep_assert_in_irq() is probably cheap but it
might look odd why RT needs or just TIMER and not HRTIMER.

> > Thanks.

Sebastian

Re: [PATCH rcu] srcu: Guarantee non-negative return value from srcu_read_lock()

2024-10-23 Thread Alan Huang

On Oct 22, 2024, at 22:26, Paul E. McKenney  wrote:
> 
> On Tue, Oct 22, 2024 at 12:13:12AM -0700, Christoph Hellwig wrote:
>> On Tue, Oct 22, 2024 at 09:10:18AM +0200, Peter Zijlstra wrote:
>>> Ah, well, the thing that got us here is that we (Andrii and me) wanted
>>> to use -1 as an 'invalid' value to indicate SRCU is not currently in
>>> use.
>>> 
>>> So it all being int is really rather convenient :-)
>> 
>> Then please document that use.  Maybe even with a symolic name for
>> -1 that clearly describes these uses.
> 
> Would this work?
> 
> #define SRCU_INVALID_INDEX -1

Is there any similar guarantee of the return value of get_state_synchronize_rcu
or start_poll_synchronize_rcu, like invalid value?

> 
> Whatever the name, maybe Peter and Andrii define this under #ifndef
> right now, and we get it into include/linux/srcu.h over time.
> 
> Or is there a better way?  Or name, for that matter.
> 
> Thanx, Paul
>

Re: [PATCH v4 2/4] pidfd: add PIDFD_SELF_* sentinels to refer to own thread/process

2024-10-23 Thread Lorenzo Stoakes

On Tue, Oct 22, 2024 at 05:53:00PM -0700, Shakeel Butt wrote:
> On Thu, Oct 17, 2024 at 10:05:50PM GMT, Lorenzo Stoakes wrote:
> > It is useful to be able to utilise the pidfd mechanism to reference the
> > current thread or process (from a userland point of view - thread group
> > leader from the kernel's point of view).
> >
> > Therefore introduce PIDFD_SELF_THREAD to refer to the current thread, and
> > PIDFD_SELF_THREAD_GROUP to refer to the current thread group leader.
> >
> > For convenience and to avoid confusion from userland's perspective we alias
> > these:
> >
> > * PIDFD_SELF is an alias for PIDFD_SELF_THREAD - This is nearly always what
> >   the user will want to use, as they would find it surprising if for
> >   instance fd's were unshared()'d and they wanted to invoke pidfd_getfd()
> >   and that failed.
> >
> > * PIDFD_SELF_PROCESS is an alias for PIDFD_SELF_THREAD_GROUP - Most users
> >   have no concept of thread groups or what a thread group leader is, and
> >   from userland's perspective and nomenclature this is what userland
> >   considers to be a process.
>
> Should users use PIDFD_SELF_PROCESS in process_madvise() for self
> madvise() (once the support is added)?

You can use either it will make no difference as both will get you to
current->mm which has to be shared. So I'd go with PIDFD_SELF for brevity
:)

This series and the prerequisites I already added to process_madvise()
already provide support so with this in you can just use this for that.

>
> >
> [...]
> >
> > +static struct pid *pidfd_get_pid_self(unsigned int pidfd, unsigned int 
> > *flags)
> > +{
> > +   bool is_thread = pidfd == PIDFD_SELF_THREAD;
> > +   enum pid_type type = is_thread ? PIDTYPE_PID : PIDTYPE_TGID;
> > +   struct pid *pid = *task_pid_ptr(current, type);
> > +
> > +   /* The caller expects an elevated reference count. */
> > +   get_pid(pid);
>
> Do you want this helper to work for scenarios where pid is used across
> context? Otherwise can't we get rid of this get and later put for self?

Yeah I hate doing this but it's to keep things as generic as possible and
to ensure that no user of this logic accidentally drops the reference count
unintentionally + to reduce churn.

I think doing it this way is better than trying to special case the put,
and in any case if you did the 'old' way of referencing yourself you'd have
ended up doing this anyway so it's at least no delta!

>
> > +   return pid;
> > +}
> > +
>
> Overall looks good to me.
>
> Reviewed-by: Shakeel Butt 

Thanks!

Re: [PATCH v1 04/10] iommufd/viommu: Allow drivers to control vdev_id lifecycle

2024-10-23 Thread Nicolin Chen

On Thu, Sep 05, 2024 at 03:01:19PM -0300, Jason Gunthorpe wrote:
> On Tue, Aug 27, 2024 at 10:02:06AM -0700, Nicolin Chen wrote:
> > The iommufd core provides a lookup helper for an IOMMU driver to find a
> > device pointer by device's per-viommu virtual ID. Yet a driver may need
> > an inverted lookup to find a device's per-viommu virtual ID by a device
> > pointer, e.g. when reporting virtual IRQs/events back to the user space.
> > In this case, it'd be unsafe for iommufd core to do an inverted lookup,
> > as the driver can't track the lifecycle of a viommu object or a vdev_id
> > object.
> > 
> > Meanwhile, some HW can even support virtual device ID lookup by its HW-
> > accelerated virtualization feature. E.g. Tegra241 CMDQV HW supports to
> > execute vanilla guest-issued SMMU commands containing virtual Stream ID
> > but requires software to configure a link between virtual Stream ID and
> > physical Stream ID via HW registers. So not only the iommufd core needs
> > a vdev_id lookup table, drivers will want one too.
> > 
> > Given the two justifications above, it's the best practice to provide a
> > a pair of set_vdev_id/unset_vdev_id ops in the viommu ops, so a driver
> > can implement them to control a vdev_id's lifecycle, and configure the
> > HW properly if required.
> 
> I think the lifecycle rules should be much simpler.
> 
> If a nested domain is attached to a STE/RID/device then the vIOMMU
> affiliated with that nested domain is pinned while the STE is in place
> 
> So the driver only need to provide locking around attach changing the
> STE's vIOMMU vs async operations translating from a STE to a
> vIOMMU. This can be a simple driver lock of some kind, ie a rwlock
> across the STE table.
> 
> Generally that is how all the async events should work, go from the
> STE to the VIOMMU to a iommufd callback to the iommufd event
> queue. iommufd will translate the struct device from the driver to an
> idev_id (or maybe even a vdevid) the same basic way the PRI code works

I am trying to draft something following this, and here is what
it would look like:

draft---
struct arm_smmu_master {

+   struct rw_semaphore lock;
+   struct arm_vsmmu *vsmmu;

};

->attach_dev() {
down_write(&master->lock);
if (domain->type == IOMMU_DOMAIN_NESTED)
master->vsmmu = to_smmu_domain(domain)->vsmmu;
else
master->vsmmu = NULL;
up_write(&master->lock);
}

isr() {
down_read(&master->lock);
if (master->vsmmu) {
xa_lock(&master->vsmmu->core.vdevs);
vdev = iommufd_viommu_dev_to_vdev(&master->vsmmu->core,
  master->dev);
if (vdev) {
struct iommu_virq_arm_smmuv3 virq_data = evt;

virq_data.evt[0] &= ~EVTQ_0_SID;
virq_data.evt[0] |= FIELD_PREP(EVTQ_0_SID, vdev->id);
return iommufd_viommu_report_irq(
vdev->viommu,
IOMMU_VIRQ_TYPE_ARM_SMMUV3, &virq_data,
sizeof(virq_data));
} else {
rc = -ENOENT;
}
xa_unlock(&master->vsmmu->core.vdevs);
}
up_read(&master->lock);
}

[Comparison]  | This v1 | Draft
1. Adds to master | A lock and vdev ptr | A lock and viommu ptr
2. Set/unset ptr  | In ->vdevice_alloc/free | In all ->attach_dev
3. Do dev_to_vdev | master->vdev->id| attach_handle?

Both solutions needs a driver-level lock and an extra pointer in
the master structure. And both ISR routines require that driver-
level lock to avoid race against attach_dev v.s. vdev alloc/free.
Overall, taking step.3 into consideration, it seems that letting
master lock&hold the vdev pointer (i.e. this v1) is simpler?

As for the implementation of iommufd_viommu_dev_to_vdev(), I read
the attach_handle part in the PRI code, yet I don't see the lock
that protects the handle returned by iommu_attach_handle_get() in
iommu_report_device_fault/find_fault_handler().

And I see the kdoc of iommu_attach_handle_get() mentioning:
  "Callers are required to synchronize the call of
   iommu_attach_handle_get() with domain attachment
   and detachment. The attach handle can only be used
   during its life cycle."
But the caller iommu_report_device_fault() is an async event that
cannot guarantee the lifecycle. Would you please shed some light?

Thanks
Nicolin

Re: [PATCH] kselftest/arm64: Log fp-stress child startup errors to stdout

2024-10-23 Thread Catalin Marinas

On Wed, 23 Oct 2024 00:20:45 +0100, Mark Brown wrote:
> Currently if we encounter an error between fork() and exec() of a child
> process we log the error to stderr. This means that the errors don't get
> annotated with the child information which makes diagnostics harder and
> means that if we miss the exit signal from the child we can deadlock
> waiting for output from the child. Improve robustness and output quality
> by logging to stdout instead.
> 
> [...]

Applied to arm64 (for-next/kselftest), thanks!

[1/1] kselftest/arm64: Log fp-stress child startup errors to stdout
  https://git.kernel.org/arm64/c/dca93d29845d

-- 
Catalin

Re: [PATCH v4 1/2] virt: pvmemcontrol: control guest physical memory properties

2024-10-23 Thread kernel test robot

Hi Yuanchu,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.12-rc4 next-20241023]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Yuanchu-Xie/virt-pvmemcontrol-add-Yuanchu-and-Pasha-as-maintainers/20241022-045035
base:   linus/master
patch link:
https://lore.kernel.org/r/20241021204849.1580384-1-yuanchu%40google.com
patch subject: [PATCH v4 1/2] virt: pvmemcontrol: control guest physical memory 
properties
config: i386-allmodconfig 
(https://download.01.org/0day-ci/archive/20241023/202410231908.jzfkawyd-...@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20241023/202410231908.jzfkawyd-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202410231908.jzfkawyd-...@intel.com/

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "per_cpu_ptr_to_phys" 
>> [drivers/virt/pvmemcontrol/pvmemcontrol.ko] undefined!

Kconfig warnings: (for reference only)
   WARNING: unmet direct dependencies detected for GET_FREE_REGION
   Depends on [n]: SPARSEMEM [=n]
   Selected by [m]:
   - RESOURCE_KUNIT_TEST [=m] && RUNTIME_TESTING_MENU [=y] && KUNIT [=m]

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Re: [PATCH net 1/3] mptcp: init: protect sched with rcu_read_lock

2024-10-23 Thread Simon Horman

On Mon, Oct 21, 2024 at 12:25:26PM +0200, Matthieu Baerts (NGI0) wrote:
> Enabling CONFIG_PROVE_RCU_LIST with its dependence CONFIG_RCU_EXPERT
> creates this splat when an MPTCP socket is created:
> 
>   =
>   WARNING: suspicious RCU usage
>   6.12.0-rc2+ #11 Not tainted
>   -
>   net/mptcp/sched.c:44 RCU-list traversed in non-reader section!!
> 
>   other info that might help us debug this:
> 
>   rcu_scheduler_active = 2, debug_locks = 1
>   no locks held by mptcp_connect/176.
> 
>   stack backtrace:
>   CPU: 0 UID: 0 PID: 176 Comm: mptcp_connect Not tainted 6.12.0-rc2+ #11
>   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>   Call Trace:
>
>dump_stack_lvl (lib/dump_stack.c:123)
>lockdep_rcu_suspicious (kernel/locking/lockdep.c:6822)
>mptcp_sched_find (net/mptcp/sched.c:44 (discriminator 7))
>mptcp_init_sock (net/mptcp/protocol.c:2867 (discriminator 1))
>? sock_init_data_uid (arch/x86/include/asm/atomic.h:28)
>inet_create.part.0.constprop.0 (net/ipv4/af_inet.c:386)
>? __sock_create (include/linux/rcupdate.h:347 (discriminator 1))
>__sock_create (net/socket.c:1576)
>__sys_socket (net/socket.c:1671)
>? __pfx___sys_socket (net/socket.c:1712)
>? do_user_addr_fault (arch/x86/mm/fault.c:1419 (discriminator 1))
>__x64_sys_socket (net/socket.c:1728)
>do_syscall_64 (arch/x86/entry/common.c:52 (discriminator 1))
>entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> 
> That's because when the socket is initialised, rcu_read_lock() is not
> used despite the explicit comment written above the declaration of
> mptcp_sched_find() in sched.c. Adding the missing lock/unlock avoids the
> warning.
> 
> Fixes: 1730b2b2c5a5 ("mptcp: add sched in mptcp_sock")
> Cc: sta...@vger.kernel.org
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/523
> Reviewed-by: Geliang Tang 
> Signed-off-by: Matthieu Baerts (NGI0) 

Reviewed-by: Simon Horman

Re: [PATCH net 3/3] selftests: mptcp: list sysctl data

2024-10-23 Thread Simon Horman

On Mon, Oct 21, 2024 at 12:25:28PM +0200, Matthieu Baerts (NGI0) wrote:
> Listing all the values linked to the MPTCP sysctl knobs was not
> exercised in MPTCP test suite.
> 
> Let's do that to avoid any regressions, but also to have a kernel with a
> debug kconfig verifying more assumptions. For the moment, we are not
> interested by the output, only to avoid crashes and warnings.
> 
> Signed-off-by: Matthieu Baerts (NGI0) 

I am assuming that we are ok with expanding test coverage via net,
which FWIIW, does seem reasonable to me.

Reviewed-by: Simon Horman 

...

Re: [PATCH net 2/3] mptcp: remove unneeded lock when listing scheds

2024-10-23 Thread Simon Horman

On Mon, Oct 21, 2024 at 12:25:27PM +0200, Matthieu Baerts (NGI0) wrote:
> mptcp_get_available_schedulers() needs to iterate over the schedulers'
> list only to read the names: it doesn't modify anything there.
> 
> In this case, it is enough to hold the RCU read lock, no need to combine
> this with the associated spin lock.
> 
> Fixes: 73c900aa3660 ("mptcp: add net.mptcp.available_schedulers")
> Cc: sta...@vger.kernel.org
> Suggested-by: Paolo Abeni 
> Reviewed-by: Geliang Tang 
> Signed-off-by: Matthieu Baerts (NGI0) 

I do wonder if it would be more appropriate to route this via net-next
(without a fixes tag) rather than via net. But either way this looks good
to me.

Reviewed-by: Simon Horman 

...

Re: [RFC PATCH] remoteproc: core: Add support for predefined notifyids

2024-10-23 Thread Mathieu Poirier

Hello Daniel,

On Fri, Oct 18, 2024 at 02:09:29PM +0300, Daniel Baluta wrote:
> Currently we generate notifyids in the linux kernel and override
> those found in rsc_table.
> 
> This doesn't play well with users expecting to use the exact ids
> from rsc_table.
> 
> So, use predefined notifyids found in rsc_table if any. Otherwise,
> let Linux generate the ids as before.
> 
> Keypoint is we also define an invalid notifid as 0xU. This
> should be placed as notifids if users want Linux to generate the ids.
> 
> Signed-off-by: Alexandru Lastur 
> Signed-off-by: Daniel Baluta 
> ---
>  drivers/remoteproc/remoteproc_core.c | 14 --
>  include/linux/remoteproc.h   |  1 +
>  2 files changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/remoteproc/remoteproc_core.c 
> b/drivers/remoteproc/remoteproc_core.c
> index f276956f2c5c..9f00fe16da38 100644
> --- a/drivers/remoteproc/remoteproc_core.c
> +++ b/drivers/remoteproc/remoteproc_core.c
> @@ -332,6 +332,7 @@ int rproc_alloc_vring(struct rproc_vdev *rvdev, int i)
>   int ret, notifyid;
>   struct rproc_mem_entry *mem;
>   size_t size;
> + int start, end;
>  
>   /* actual size of vring (in bytes) */
>   size = PAGE_ALIGN(vring_size(rvring->num, rvring->align));
> @@ -363,9 +364,18 @@ int rproc_alloc_vring(struct rproc_vdev *rvdev, int i)
>   /*
>* Assign an rproc-wide unique index for this vring
>* TODO: assign a notifyid for rvdev updates as well
> -  * TODO: support predefined notifyids (via resource table)
>*/
> - ret = idr_alloc(&rproc->notifyids, rvring, 0, 0, GFP_KERNEL);
> +
> + start = 0;
> + end = 0;
> +
> + /* use id if specified in rsc table */
> + if (rsc->vring[i].notifyid != RSC_INVALID_NOTIFYID) {
> + start = rsc->vring[i].notifyid;
> + end = start + 1;
> + }

This will likely introduce a backward compatibility issue where anyone that
has more than one vring and set their notifyids to anything else than 0x
in the resource table will see a boot failure.

A while back the openAMP group started discussions on using the configuration
space of a virtio device to enhance device discovery, with exactly this kind of
use case in mind.  I think it is the only way to move forward with this
feature, though it is a big job that requires a lot of community interactions.

Regards,
Mathieu

> +
> + ret = idr_alloc(&rproc->notifyids, rvring, start, end, GFP_KERNEL);
>   if (ret < 0) {
>   dev_err(dev, "idr_alloc failed: %d\n", ret);
>   return ret;
> diff --git a/include/linux/remoteproc.h b/include/linux/remoteproc.h
> index b4795698d8c2..98c3e181086e 100644
> --- a/include/linux/remoteproc.h
> +++ b/include/linux/remoteproc.h
> @@ -238,6 +238,7 @@ struct fw_rsc_trace {
>   u8 name[32];
>  } __packed;
>  
> +#define RSC_INVALID_NOTIFYID 0xU
>  /**
>   * struct fw_rsc_vdev_vring - vring descriptor entry
>   * @da: device address
> -- 
> 2.43.0
>

[PATCH net-next v5 01/12] selftests: ncdevmem: Redirect all non-payload output to stderr

2024-10-23 Thread Stanislav Fomichev

That should make it possible to do expected payload validation on
the caller side.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 61 +-
 1 file changed, 30 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index 64d6805381c5..9245d3f158dd 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -88,7 +88,6 @@ void print_nonzero_bytes(void *ptr, size_t size)
 
for (i = 0; i < size; i++)
putchar(p[i]);
-   printf("\n");
 }
 
 void validate_buffer(void *line, size_t size)
@@ -120,7 +119,7 @@ void validate_buffer(void *line, size_t size)
char command[256];  \
memset(command, 0, sizeof(command));\
snprintf(command, sizeof(command), cmd, ##__VA_ARGS__); \
-   printf("Running: %s\n", command);   \
+   fprintf(stderr, "Running: %s\n", command);  
 \
system(command);\
})
 
@@ -128,22 +127,22 @@ static int reset_flow_steering(void)
 {
int ret = 0;
 
-   ret = run_command("sudo ethtool -K %s ntuple off", ifname);
+   ret = run_command("sudo ethtool -K %s ntuple off >&2", ifname);
if (ret)
return ret;
 
-   return run_command("sudo ethtool -K %s ntuple on", ifname);
+   return run_command("sudo ethtool -K %s ntuple on >&2", ifname);
 }
 
 static int configure_headersplit(bool on)
 {
-   return run_command("sudo ethtool -G %s tcp-data-split %s", ifname,
+   return run_command("sudo ethtool -G %s tcp-data-split %s >&2", ifname,
   on ? "on" : "off");
 }
 
 static int configure_rss(void)
 {
-   return run_command("sudo ethtool -X %s equal %d", ifname, start_queue);
+   return run_command("sudo ethtool -X %s equal %d >&2", ifname, 
start_queue);
 }
 
 static int configure_channels(unsigned int rx, unsigned int tx)
@@ -153,7 +152,7 @@ static int configure_channels(unsigned int rx, unsigned int 
tx)
 
 static int configure_flow_steering(void)
 {
-   return run_command("sudo ethtool -N %s flow-type tcp4 src-ip %s dst-ip 
%s src-port %s dst-port %s queue %d",
+   return run_command("sudo ethtool -N %s flow-type tcp4 src-ip %s dst-ip 
%s src-port %s dst-port %s queue %d >&2",
   ifname, client_ip, server_ip, port, port, 
start_queue);
 }
 
@@ -187,7 +186,7 @@ static int bind_rx_queue(unsigned int ifindex, unsigned int 
dmabuf_fd,
goto err_close;
}
 
-   printf("got dmabuf id=%d\n", rsp->id);
+   fprintf(stderr, "got dmabuf id=%d\n", rsp->id);
dmabuf_id = rsp->id;
 
netdev_bind_rx_req_free(req);
@@ -314,8 +313,8 @@ int do_server(void)
if (ret)
error(errno, errno, "%s: [FAIL, set sock opt]\n", TEST_PREFIX);
 
-   printf("binding to address %s:%d\n", server_ip,
-  ntohs(server_sin.sin_port));
+   fprintf(stderr, "binding to address %s:%d\n", server_ip,
+   ntohs(server_sin.sin_port));
 
ret = bind(socket_fd, &server_sin, sizeof(server_sin));
if (ret)
@@ -329,14 +328,14 @@ int do_server(void)
 
inet_ntop(server_sin.sin_family, &server_sin.sin_addr, buffer,
  sizeof(buffer));
-   printf("Waiting or connection on %s:%d\n", buffer,
-  ntohs(server_sin.sin_port));
+   fprintf(stderr, "Waiting or connection on %s:%d\n", buffer,
+   ntohs(server_sin.sin_port));
client_fd = accept(socket_fd, &client_addr, &client_addr_len);
 
inet_ntop(client_addr.sin_family, &client_addr.sin_addr, buffer,
  sizeof(buffer));
-   printf("Got connection from %s:%d\n", buffer,
-  ntohs(client_addr.sin_port));
+   fprintf(stderr, "Got connection from %s:%d\n", buffer,
+   ntohs(client_addr.sin_port));
 
while (1) {
struct iovec iov = { .iov_base = iobuf,
@@ -349,14 +348,13 @@ int do_server(void)
ssize_t ret;
 
is_devmem = false;
-   printf("\n\n");
 
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
msg.msg_control = ctrl_data;
msg.msg_controllen = sizeof(ctrl_data);
ret = recvmsg(client_fd, &msg, MSG_SOCK_DEVMEM);
-   printf("recvmsg ret=%ld\n", ret);
+   fprintf(stderr, "recvmsg ret=%ld\n", ret);
if (ret < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
continue;
if (ret < 0) {
@@ -364,7 +362,7 @@ int do_server(void)
continue;
}
if (ret == 0) {
-   printf("cl

[PATCH net-next v5 02/12] selftests: ncdevmem: Separate out dmabuf provider

2024-10-23 Thread Stanislav Fomichev

So we can plug the other ones in the future if needed.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 198 +++--
 1 file changed, 117 insertions(+), 81 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index 9245d3f158dd..9b3ca6398a9d 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -71,17 +71,101 @@ static char *ifname = "eth1";
 static unsigned int ifindex;
 static unsigned int dmabuf_id;
 
-void print_bytes(void *ptr, size_t size)
+struct memory_buffer {
+   int fd;
+   size_t size;
+
+   int devfd;
+   int memfd;
+   char *buf_mem;
+};
+
+struct memory_provider {
+   struct memory_buffer *(*alloc)(size_t size);
+   void (*free)(struct memory_buffer *ctx);
+   void (*memcpy_from_device)(void *dst, struct memory_buffer *src,
+  size_t off, int n);
+};
+
+static struct memory_buffer *udmabuf_alloc(size_t size)
 {
-   unsigned char *p = ptr;
-   int i;
+   struct udmabuf_create create;
+   struct memory_buffer *ctx;
+   int ret;
 
-   for (i = 0; i < size; i++)
-   printf("%02hhX ", p[i]);
-   printf("\n");
+   ctx = malloc(sizeof(*ctx));
+   if (!ctx)
+   error(1, ENOMEM, "malloc failed");
+
+   ctx->size = size;
+
+   ctx->devfd = open("/dev/udmabuf", O_RDWR);
+   if (ctx->devfd < 0)
+   error(1, errno,
+ "%s: [skip,no-udmabuf: Unable to access DMA buffer device 
file]\n",
+ TEST_PREFIX);
+
+   ctx->memfd = memfd_create("udmabuf-test", MFD_ALLOW_SEALING);
+   if (ctx->memfd < 0)
+   error(1, errno, "%s: [skip,no-memfd]\n", TEST_PREFIX);
+
+   ret = fcntl(ctx->memfd, F_ADD_SEALS, F_SEAL_SHRINK);
+   if (ret < 0)
+   error(1, errno, "%s: [skip,fcntl-add-seals]\n", TEST_PREFIX);
+
+   ret = ftruncate(ctx->memfd, size);
+   if (ret == -1)
+   error(1, errno, "%s: [FAIL,memfd-truncate]\n", TEST_PREFIX);
+
+   memset(&create, 0, sizeof(create));
+
+   create.memfd = ctx->memfd;
+   create.offset = 0;
+   create.size = size;
+   ctx->fd = ioctl(ctx->devfd, UDMABUF_CREATE, &create);
+   if (ctx->fd < 0)
+   error(1, errno, "%s: [FAIL, create udmabuf]\n", TEST_PREFIX);
+
+   ctx->buf_mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
+   ctx->fd, 0);
+   if (ctx->buf_mem == MAP_FAILED)
+   error(1, errno, "%s: [FAIL, map udmabuf]\n", TEST_PREFIX);
+
+   return ctx;
+}
+
+static void udmabuf_free(struct memory_buffer *ctx)
+{
+   munmap(ctx->buf_mem, ctx->size);
+   close(ctx->fd);
+   close(ctx->memfd);
+   close(ctx->devfd);
+   free(ctx);
 }
 
-void print_nonzero_bytes(void *ptr, size_t size)
+static void udmabuf_memcpy_from_device(void *dst, struct memory_buffer *src,
+  size_t off, int n)
+{
+   struct dma_buf_sync sync = {};
+
+   sync.flags = DMA_BUF_SYNC_START;
+   ioctl(src->fd, DMA_BUF_IOCTL_SYNC, &sync);
+
+   memcpy(dst, src->buf_mem + off, n);
+
+   sync.flags = DMA_BUF_SYNC_END;
+   ioctl(src->fd, DMA_BUF_IOCTL_SYNC, &sync);
+}
+
+static struct memory_provider udmabuf_memory_provider = {
+   .alloc = udmabuf_alloc,
+   .free = udmabuf_free,
+   .memcpy_from_device = udmabuf_memcpy_from_device,
+};
+
+static struct memory_provider *provider = &udmabuf_memory_provider;
+
+static void print_nonzero_bytes(void *ptr, size_t size)
 {
unsigned char *p = ptr;
unsigned int i;
@@ -201,42 +285,7 @@ static int bind_rx_queue(unsigned int ifindex, unsigned 
int dmabuf_fd,
return -1;
 }
 
-static void create_udmabuf(int *devfd, int *memfd, int *buf, size_t 
dmabuf_size)
-{
-   struct udmabuf_create create;
-   int ret;
-
-   *devfd = open("/dev/udmabuf", O_RDWR);
-   if (*devfd < 0) {
-   error(70, 0,
- "%s: [skip,no-udmabuf: Unable to access DMA buffer device 
file]\n",
- TEST_PREFIX);
-   }
-
-   *memfd = memfd_create("udmabuf-test", MFD_ALLOW_SEALING);
-   if (*memfd < 0)
-   error(70, 0, "%s: [skip,no-memfd]\n", TEST_PREFIX);
-
-   /* Required for udmabuf */
-   ret = fcntl(*memfd, F_ADD_SEALS, F_SEAL_SHRINK);
-   if (ret < 0)
-   error(73, 0, "%s: [skip,fcntl-add-seals]\n", TEST_PREFIX);
-
-   ret = ftruncate(*memfd, dmabuf_size);
-   if (ret == -1)
-   error(74, 0, "%s: [FAIL,memfd-truncate]\n", TEST_PREFIX);
-
-   memset(&create, 0, sizeof(create));
-
-   create.memfd = *memfd;
-   create.offset = 0;
-   create.size = dmabuf_size;
-   *buf = ioctl(*devfd, UDMABUF_CREATE, &create);
-   if (*buf < 0)
-   err

[PATCH net-next v5 03/12] selftests: ncdevmem: Unify error handling

2024-10-23 Thread Stanislav Fomichev

There is a bunch of places where error() calls look out of place.
Use the same error(1, errno, ...) pattern everywhere.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index 9b3ca6398a9d..07222bfcdb07 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -339,33 +339,33 @@ int do_server(struct memory_buffer *mem)
server_sin.sin_port = htons(atoi(port));
 
ret = inet_pton(server_sin.sin_family, server_ip, &server_sin.sin_addr);
-   if (socket < 0)
-   error(79, 0, "%s: [FAIL, create socket]\n", TEST_PREFIX);
+   if (ret < 0)
+   error(1, pton, "%s: [FAIL, create socket]\n", TEST_PREFIX);
 
socket_fd = socket(server_sin.sin_family, SOCK_STREAM, 0);
-   if (socket < 0)
-   error(errno, errno, "%s: [FAIL, create socket]\n", TEST_PREFIX);
+   if (socket_fd < 0)
+   error(1, errno, "%s: [FAIL, create socket]\n", TEST_PREFIX);
 
ret = setsockopt(socket_fd, SOL_SOCKET, SO_REUSEPORT, &opt,
 sizeof(opt));
if (ret)
-   error(errno, errno, "%s: [FAIL, set sock opt]\n", TEST_PREFIX);
+   error(1, errno, "%s: [FAIL, set sock opt]\n", TEST_PREFIX);
 
ret = setsockopt(socket_fd, SOL_SOCKET, SO_REUSEADDR, &opt,
 sizeof(opt));
if (ret)
-   error(errno, errno, "%s: [FAIL, set sock opt]\n", TEST_PREFIX);
+   error(1, errno, "%s: [FAIL, set sock opt]\n", TEST_PREFIX);
 
fprintf(stderr, "binding to address %s:%d\n", server_ip,
ntohs(server_sin.sin_port));
 
ret = bind(socket_fd, &server_sin, sizeof(server_sin));
if (ret)
-   error(errno, errno, "%s: [FAIL, bind]\n", TEST_PREFIX);
+   error(1, errno, "%s: [FAIL, bind]\n", TEST_PREFIX);
 
ret = listen(socket_fd, 1);
if (ret)
-   error(errno, errno, "%s: [FAIL, listen]\n", TEST_PREFIX);
+   error(1, errno, "%s: [FAIL, listen]\n", TEST_PREFIX);
 
client_addr_len = sizeof(client_addr);
 
-- 
2.47.0

[PATCH net-next v6 2/3] connector/cn_proc: Kunit tests for threads hash table

2024-10-23 Thread Anjali Kulkarni

Kunit tests to test hash table add, delete, duplicate add and delete.
Add following configs and compile kernel code:

CONFIG_CONNECTOR=y
CONFIG_PROC_EVENTS=y
CONFIG_NET=y
CONFIG_KUNIT=m
CONFIG_CN_HASH_KUNIT_TEST=m

To run kunit tests:
sudo modprobe cn_hash_test

Output of kunit tests and hash table contents are displayed in
/var/log/messages (at KERN_DEBUG level).

Signed-off-by: Anjali Kulkarni 
---
 drivers/connector/cn_hash.c  |  40 +
 drivers/connector/connector.c|  12 ++
 include/linux/connector.h|   4 +
 lib/Kconfig.debug|  19 +++
 lib/Makefile |   1 +
 lib/cn_hash_test.c   | 169 +++
 lib/cn_hash_test.h   |  10 ++
 tools/testing/kunit/configs/all_tests.config |   5 +
 8 files changed, 260 insertions(+)
 create mode 100644 lib/cn_hash_test.c
 create mode 100644 lib/cn_hash_test.h

diff --git a/drivers/connector/cn_hash.c b/drivers/connector/cn_hash.c
index b94f6c461496..a8b8760c2e8c 100644
--- a/drivers/connector/cn_hash.c
+++ b/drivers/connector/cn_hash.c
@@ -165,6 +165,46 @@ int cn_hash_get_exval(struct cn_hash_dev *hdev, pid_t pid)
return -EINVAL;
 }
 
+int cn_hash_display_hlist(struct cn_hash_dev *hdev, pid_t pid, int max_len,
+ int *hkey, int *key_display)
+{
+   struct uexit_pid_hnode *hnode;
+   int key, count = 0;
+
+   mutex_lock(&hdev->uexit_hash_lock);
+   key = hash_min(pid, HASH_BITS(hdev->uexit_pid_htable));
+   pr_debug("Bucket: %d\n", key);
+
+   hlist_for_each_entry(hnode,
+&hdev->uexit_pid_htable[key],
+uexit_pid_hlist) {
+   if (key_display[key] != 1) {
+   if (!hnode->uexit_pid_hlist.next)
+   pr_debug("pid %d ", hnode->pid);
+   else
+   pr_debug("pid %d --> ", hnode->pid);
+   }
+   count++;
+   }
+
+   mutex_unlock(&hdev->uexit_hash_lock);
+
+   if (key_display[key] != 1 && !count)
+   pr_debug("(empty)\n");
+
+   pr_debug("\n");
+
+   *hkey = key;
+
+   if (count > max_len) {
+   pr_err("%d entries in hlist for key %d, expected %d\n",
+  count, key, max_len);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 bool cn_hash_table_empty(struct cn_hash_dev *hdev)
 {
bool is_empty;
diff --git a/drivers/connector/connector.c b/drivers/connector/connector.c
index 2494a443fbd6..9b0cbe134e87 100644
--- a/drivers/connector/connector.c
+++ b/drivers/connector/connector.c
@@ -304,6 +304,18 @@ int cn_get_exval(pid_t pid)
 }
 EXPORT_SYMBOL_GPL(cn_get_exval);
 
+int cn_display_hlist(pid_t pid, int max_len, int *hkey, int *key_display)
+{
+   struct cn_dev *dev = &cdev;
+
+   if (!cn_already_initialized)
+   return 0;
+
+   return cn_hash_display_hlist(dev->hdev, pid, max_len,
+   hkey, key_display);
+}
+EXPORT_SYMBOL_GPL(cn_display_hlist);
+
 bool cn_table_empty(void)
 {
struct cn_dev *dev = &cdev;
diff --git a/include/linux/connector.h b/include/linux/connector.h
index 3f0648c886bc..393e1a391b70 100644
--- a/include/linux/connector.h
+++ b/include/linux/connector.h
@@ -173,4 +173,8 @@ int cn_get_exval(pid_t pid);
 bool cn_table_empty(void);
 bool cn_hash_table_empty(struct cn_hash_dev *hdev);
 
+int cn_display_hlist(pid_t pid, int max_len, int *hkey, int *key_display);
+int cn_hash_display_hlist(struct cn_hash_dev *hdev, pid_t pid, int max_len,
+ int *hkey, int *key_display);
+
 #endif /* __CONNECTOR_H */
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 7315f643817a..784bd66221ab 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -2705,6 +2705,25 @@ config HASHTABLE_KUNIT_TEST
 
  If unsure, say N.
 
+config CN_HASH_KUNIT_TEST
+   tristate "KUnit Test for connector hashtable code" if !KUNIT_ALL_TESTS
+   depends on KUNIT
+   depends on CONNECTOR && PROC_EVENTS
+   depends on NET
+   default KUNIT_ALL_TESTS
+   help
+ This builds the hashtable KUnit test suite.
+ It tests the basic functionality of the API defined in
+ drivers/connector/cn_hash.c.
+ CONFIG_CONNECTOR=y, CONFIG_PROC_EVENTS=y and CONFIG_NET=y needs
+ to be enabled along with CONFIG_CN_HASH_KUNIT_TEST=m and
+ CONFIG_KUNIT=m in .config file to compile and then test as a
+ kernel module with "modprobe cn_hash_test".
+ For more information on KUnit and unit tests in general please
+ refer to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
+
 config LINEAR_RANGES_TEST
tristate "KUnit test for linear_ranges"
depends on KUNIT
diff --git a/lib

[PATCH 4/4] arm64: dts: imx: add imx95 dts for sof

2024-10-23 Thread Laurentiu Mihalcea

From: Laurentiu Mihalcea 

Add imx95 DTS for SOF usage.

Signed-off-by: Laurentiu Mihalcea 
---
 arch/arm64/boot/dts/freescale/Makefile|  1 +
 .../dts/freescale/imx95-19x19-evk-sof.dts | 86 +++
 2 files changed, 87 insertions(+)
 create mode 100644 arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts

diff --git a/arch/arm64/boot/dts/freescale/Makefile 
b/arch/arm64/boot/dts/freescale/Makefile
index 2a69b7ec6d6d..94660e3e8b2b 100644
--- a/arch/arm64/boot/dts/freescale/Makefile
+++ b/arch/arm64/boot/dts/freescale/Makefile
@@ -267,6 +267,7 @@ dtb-$(CONFIG_ARCH_MXC) += imx93-tqma9352-mba93xxca.dtb
 dtb-$(CONFIG_ARCH_MXC) += imx93-tqma9352-mba93xxla.dtb
 dtb-$(CONFIG_ARCH_MXC) += imx93-var-som-symphony.dtb
 dtb-$(CONFIG_ARCH_MXC) += imx95-19x19-evk.dtb
+dtb-$(CONFIG_ARCH_MXC) += imx95-19x19-evk-sof.dtb
 
 imx8mm-kontron-dl-dtbs := imx8mm-kontron-bl.dtb 
imx8mm-kontron-dl.dtbo
 
diff --git a/arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts 
b/arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts
new file mode 100644
index ..b10dc1af5ce2
--- /dev/null
+++ b/arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts
@@ -0,0 +1,86 @@
+// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
+/*
+ * Copyright 2024 NXP
+ */
+
+/dts-v1/;
+
+#include "imx95-19x19-evk.dts"
+
+/ {
+   reserved-memory {
+   adma_res: memory@8610 {
+   compatible = "shared-dma-pool";
+   reg = <0x0 0x8610 0x0 0x10>;
+   no-map;
+   };
+   };
+
+   sound-wm8962 {
+   status = "disabled";
+   };
+
+   sof-sound-wm8962 {
+   compatible = "audio-graph-card2";
+
+   links = <&cpu>;
+   label = "wm8962-audio";
+
+   hp-det-gpios = <&gpio2 11 GPIO_ACTIVE_HIGH>;
+
+   pinctrl-names = "default";
+   pinctrl-0 = <&pinctrl_hp>;
+
+   widgets =
+   "Headphone", "Headphones",
+   "Microphone", "Headset Mic";
+   routing =
+   "Headphones", "HPOUTL",
+   "Headphones", "HPOUTR",
+   "Headset Mic", "MICBIAS",
+   "IN3R", "Headset Mic",
+   "IN1R", "Headset Mic";
+   };
+
+   sof_cpu: cm7-cpu@8000 {
+   compatible = "fsl,imx95-cm7-sof";
+
+   reg = <0x0 0x8000 0x0 0x40>,
+ <0x0 0x8600 0x0 0x3000>;
+   reg-names = "dram", "mailbox";
+
+   memory-region = <&adma_res>;
+
+   mboxes = <&mu7 2 0>, <&mu7 2 1>, <&mu7 3 0>, <&mu7 3 1>;
+   mbox-names = "txdb0", "txdb1", "rxdb0", "rxdb1";
+
+   cpu: port {
+   cpu_ep: endpoint { remote-endpoint = <&codec_ep>; };
+   };
+   };
+};
+
+&wm8962 {
+   assigned-clocks = <&scmi_clk IMX95_CLK_AUDIOPLL1_VCO>,
+ <&scmi_clk IMX95_CLK_AUDIOPLL2_VCO>,
+ <&scmi_clk IMX95_CLK_AUDIOPLL1>,
+ <&scmi_clk IMX95_CLK_AUDIOPLL2>,
+ <&scmi_clk IMX95_CLK_SAI3>;
+   assigned-clock-parents = <0>, <0>, <0>, <0>, <&scmi_clk 
IMX95_CLK_AUDIOPLL1>;
+   assigned-clock-rates = <393216>, <3612672000>,
+  <393216000>, <361267200>,
+  <12288000>;
+   status = "okay";
+
+   port {
+   codec_ep: endpoint { remote-endpoint = <&cpu_ep>; };
+   };
+};
+
+&edma2 {
+   dma-channel-mask = <0x3fff>, <0x>;
+};
+
+&sai3 {
+   status = "disabled";
+};
-- 
2.34.1

[PATCH 3/4] ASoC: SOF: imx: add driver for imx95

2024-10-23 Thread Laurentiu Mihalcea

From: Laurentiu Mihalcea 

Add SOF driver for imx95.

Signed-off-by: Laurentiu Mihalcea 
---
 sound/soc/sof/imx/Kconfig  |   8 +
 sound/soc/sof/imx/Makefile |   2 +
 sound/soc/sof/imx/imx95.c  | 383 +
 3 files changed, 393 insertions(+)
 create mode 100644 sound/soc/sof/imx/imx95.c

diff --git a/sound/soc/sof/imx/Kconfig b/sound/soc/sof/imx/Kconfig
index 4751b04d5e6f..51a70a193533 100644
--- a/sound/soc/sof/imx/Kconfig
+++ b/sound/soc/sof/imx/Kconfig
@@ -50,4 +50,12 @@ config SND_SOC_SOF_IMX8ULP
  Say Y if you have such a device.
  If unsure select "N".
 
+config SND_SOC_SOF_IMX95
+   tristate "SOF support for i.MX95"
+   depends on IMX_DSP
+   help
+ This adds support for Sound Open Firmware for NXP i.MX95 platforms.
+ Say Y if you have such a device.
+ If unsure select "N".
+
 endif ## SND_SOC_SOF_IMX_TOPLEVEL
diff --git a/sound/soc/sof/imx/Makefile b/sound/soc/sof/imx/Makefile
index be0bf0736dfa..715ac3798668 100644
--- a/sound/soc/sof/imx/Makefile
+++ b/sound/soc/sof/imx/Makefile
@@ -2,10 +2,12 @@
 snd-sof-imx8-y := imx8.o
 snd-sof-imx8m-y := imx8m.o
 snd-sof-imx8ulp-y := imx8ulp.o
+snd-sof-imx95-y := imx95.o
 
 snd-sof-imx-common-y := imx-common.o
 
 obj-$(CONFIG_SND_SOC_SOF_IMX8) += snd-sof-imx8.o
 obj-$(CONFIG_SND_SOC_SOF_IMX8M) += snd-sof-imx8m.o
 obj-$(CONFIG_SND_SOC_SOF_IMX8ULP) += snd-sof-imx8ulp.o
+obj-$(CONFIG_SND_SOC_SOF_IMX95) += snd-sof-imx95.o
 obj-$(CONFIG_SND_SOC_SOF_IMX_COMMON) += imx-common.o
diff --git a/sound/soc/sof/imx/imx95.c b/sound/soc/sof/imx/imx95.c
new file mode 100644
index ..3f7ed6a16c42
--- /dev/null
+++ b/sound/soc/sof/imx/imx95.c
@@ -0,0 +1,383 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
+/*
+ * Copyright 2024 NXP
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "../sof-of-dev.h"
+#include "../ops.h"
+
+#define IMX_SIP_SRC 0xC205
+#define IMX_SIP_SRC_M_RESET_ADDR_SET 0x03
+
+#define IMX95_CPU_VEC_FLAGS_BOOT BIT(29)
+
+#define IMX_SIP_LMM 0xC20F
+#define IMX_SIP_LMM_BOOT 0x0
+#define IMX_SIP_LMM_SHUTDOWN 0x1
+
+#define IMX95_M7_LM_ID 0x1
+
+#define MBOX_DSPBOX_OFFSET 0x1000
+
+struct imx95_priv {
+   struct platform_device *ipc_dev;
+   struct imx_dsp_ipc *ipc_handle;
+   resource_size_t bootaddr;
+};
+
+static void imx95_ipc_handle_reply(struct imx_dsp_ipc *ipc)
+{
+   unsigned long flags;
+   struct snd_sof_dev *sdev = imx_dsp_get_data(ipc);
+
+   spin_lock_irqsave(&sdev->ipc_lock, flags);
+   snd_sof_ipc_process_reply(sdev, 0);
+   spin_unlock_irqrestore(&sdev->ipc_lock, flags);
+}
+
+static void imx95_ipc_handle_request(struct imx_dsp_ipc *ipc)
+{
+   snd_sof_ipc_msgs_rx(imx_dsp_get_data(ipc));
+}
+
+static struct imx_dsp_ops ipc_ops = {
+   .handle_reply = imx95_ipc_handle_reply,
+   .handle_request = imx95_ipc_handle_request,
+};
+
+static int imx95_disable_enable_core(bool enable)
+{
+   struct arm_smccc_res res;
+
+   if (enable)
+   arm_smccc_smc(IMX_SIP_LMM, IMX_SIP_LMM_BOOT, IMX95_M7_LM_ID,
+ 0, 0, 0, 0, 0, &res);
+   else
+   arm_smccc_smc(IMX_SIP_LMM, IMX_SIP_LMM_SHUTDOWN, IMX95_M7_LM_ID,
+ 0, 0, 0, 0, 0, &res);
+
+   return res.a0;
+}
+
+static int imx95_run(struct snd_sof_dev *sdev)
+{
+   return imx95_disable_enable_core(true);
+}
+
+static int imx95_send_msg(struct snd_sof_dev *sdev, struct snd_sof_ipc_msg 
*msg)
+{
+   struct imx95_priv *priv = sdev->pdata->hw_pdata;
+
+   sof_mailbox_write(sdev, sdev->host_box.offset,
+ msg->msg_data, msg->msg_size);
+
+   imx_dsp_ring_doorbell(priv->ipc_handle, 0);
+
+   return 0;
+}
+
+static int imx95_get_mailbox_offset(struct snd_sof_dev *sdev)
+{
+   return MBOX_DSPBOX_OFFSET;
+}
+
+static int imx95_get_bar_index(struct snd_sof_dev *sdev, u32 type)
+{
+   switch (type) {
+   case SOF_FW_BLK_TYPE_SRAM:
+   case SOF_FW_BLK_TYPE_DRAM:
+   return type;
+   default:
+   return -EINVAL;
+   }
+}
+
+static int imx95_get_window_offset(struct snd_sof_dev *sdev, u32 id)
+{
+   /* no offset for window regions - they are already relative
+* to the mailbox memory region described in DT.
+*/
+   return 0;
+}
+
+static int imx95_set_power_state(struct snd_sof_dev *sdev,
+const struct sof_dsp_power_state *target_state)
+{
+   sdev->dsp_power_state = *target_state;
+
+   return 0;
+}
+
+/* no other resources to power on during (runtime) resume so these functions
+ * don't do any resource management (i.e: clks, PDs, etc...). As for enabling
+ * the M7 LM: this is taken care of by run(), which is called during 
sof_resume().
+ */
+static int imx95_runtime_resume(struct snd_sof_dev *sdev)
+{
+   const struct sof_dsp_power_state target_state = {
+   .state =

[PATCH 2/4] ASoC: dt-bindings: audio-graph-card2: add widgets and hp-det-gpios support

2024-10-23 Thread Laurentiu Mihalcea

From: Laurentiu Mihalcea 

Introduce the 'widgets' property, allowing the creation of widgets from
4 template widgets: Microphone, Line, Headphone, and Speaker. Also
introduce the 'hp-det-gpios' property, which allows using headphone
detection using the specified GPIO.

Signed-off-by: Laurentiu Mihalcea 
---
 .../devicetree/bindings/sound/audio-graph-card2.yaml | 9 +
 1 file changed, 9 insertions(+)

diff --git a/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml 
b/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
index f943f90d8b15..f0300a08f7fe 100644
--- a/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
+++ b/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
@@ -37,6 +37,15 @@ properties:
   codec2codec:
 type: object
 description: Codec to Codec node
+  hp-det-gpios:
+maxItems: 1
+  widgets:
+description:
+  User specified audio sound widgets.
+  Each entry is a pair of strings, the first being the type of
+  widget ("Microphone", "Line", "Headphone", "Speaker"), the
+  second being the machine specific name for the widget.
+$ref: /schemas/types.yaml#/definitions/non-unique-string-array
 
 required:
   - compatible
-- 
2.34.1

[PATCH 0/4] add sof support on imx95

2024-10-23 Thread Laurentiu Mihalcea

From: Laurentiu Mihalcea 

Add sof support on imx95. This series also includes some changes to
the audio-graph-card2 binding required for the support.

Laurentiu Mihalcea (4):
  dt-bindings: remoteproc: fsl,imx-rproc: add new compatible
  ASoC: dt-bindings: audio-graph-card2: add widgets and hp-det-gpios
support
  ASoC: SOF: imx: add driver for imx95
  arm64: dts: imx: add imx95 dts for sof

 .../bindings/remoteproc/fsl,imx-rproc.yaml|  58 ++-
 .../bindings/sound/audio-graph-card2.yaml |   9 +
 arch/arm64/boot/dts/freescale/Makefile|   1 +
 .../dts/freescale/imx95-19x19-evk-sof.dts |  86 
 sound/soc/sof/imx/Kconfig |   8 +
 sound/soc/sof/imx/Makefile|   2 +
 sound/soc/sof/imx/imx95.c | 383 ++
 7 files changed, 542 insertions(+), 5 deletions(-)
 create mode 100644 arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts
 create mode 100644 sound/soc/sof/imx/imx95.c

-- 
2.34.1

Re: [PATCH rcu] srcu: Guarantee non-negative return value from srcu_read_lock()

2024-10-23 Thread Andrii Nakryiko

On Tue, Oct 22, 2024 at 11:40 PM Christoph Hellwig  wrote:
>
> On Tue, Oct 22, 2024 at 10:29:13AM -0700, Andrii Nakryiko wrote:
> > >
> > > Would this work?
> > >
> > > #define SRCU_INVALID_INDEX -1
> > >
> >
> > But why?
>
> Becaue it very clearly documents what is going on.

So does saying "returned value is going to be non-negative, always".
It's not some weird and unusual thing in C by any means.

>
> >It's a nice property to have an int-returning API where valid
> > values are only >= 0, so callers are free to use the entire negative
> > range (not just -1) for whatever they need to store in case there is
> > no srcu_idx value.
>
> Well, if you have a concrete use case for that we can probably live
> with it, but I'd rather have that use case extremely well documented,
> as it will be very puzzling to the reader.
>

See [0] for what Peter is proposing. Note hprobe_init() and its use of
`srcu_idx < 0` as a mark that there is no SRCU "session" associated
with the uprobe. Now, if we say that SRCU_INVALID_INDEX is the only
invalid value, it leaves a question: what to do with other negative
srcu_idx values? Are they valid? Should I now WARN() on "unknown
invalid" values? Why all these complications? I'd rather just not have
a unified hprobe_init() at that point, TBH.

But anyways, taking a step back. Take idr_alloc()/idr_alloc_cyclic()
APIs as an example. They return int, but valid IDs are documented to
be positive. This leaves users of this API free to use int to store ID
in their data structures, knowing that <= 0 is "no ID", and if
necessary, they can have multiple possible "no ID" situations.

Same principle here. Why prescribing a randomly picked -1 as the only
"valid" invalid value, if anything negative is clearly impossible.

  [0] 
https://lore.kernel.org/linux-trace-kernel/20241018101647.ga36...@noisy.programming.kicks-ass.net/

Re: [PATCH] selftests: livepatch: add test cases of stack_order sysfs interface

2024-10-23 Thread zhang warden



What's more, I have open a pull request of the user space tool kpatch[1] 
to use this new kernel sysfs attribute 'stack_order'.

Maintainers who feel interesting in this function can come for a look.

Best Regards.
Wardenjohn.

[1]: https://github.com/dynup/kpatch/pull/1419

Re: [PATCH net-next v5 2/3] connector/cn_proc: Kunit tests for threads hash table

2024-10-23 Thread Stanislav Fomichev

On 10/23, Anjali Kulkarni wrote:
> […snip…]
> 
>  
>  Yes, make sure all required options are picked up by
>  "./tools/testing/kunit/kunit.py run" instead of manually adding options
>  and doing modprobe.
> >>> 
> >>> The environment issues are resolved and I am able to run kunit.py, but my 
> >>> tests
> >>> are not invoked without giving options via —kconfig-add. Other tests are 
> >>> also not
> >>> invoked. Running with the manual options runs 413 tests, and with just 
> >>> kunit.py
> >>> runs 389 tests. (I have added 6). Any idea how I can make it run my tests?
> >> 
> >> The runner does: ./tools/testing/kunit/kunit.py run --alltests
> >> Is it not enough in your case? What options do you pass via
> >> --kconfig-add? Is it because CONNECTOR stuff is disabled by default?
> > 
> > No, it still does not run.
> > However, I added to tools/testing/kunit/configs/all_tests.config:
> > 
> > CONFIG_CONNECTOR=y
> > CONFIG_PROC_EVENTS=y
> > CONFIG_NET=y
> > CONFIG_CN_HASH_KUNIT_TEST=y
> > 
> > And now it does run.
> > Should I make the change above? I will also check with the kunit guys.
> > But I do not understand how it ran for you(and run into the error), or did
> > it just try to compile?
> 
> I see this in comments on top of all_tests.config.
> 
> # The config is manually maintained, though it uses KUNIT_ALL_TESTS=y to 
> enable
> # any tests whose dependencies are already satisfied. Please feel free to add
> # more options if they any new tests.
> 
> So I suppose if a test needs more dependencies, it needs to be added here.

Let's try and CC a bunch of kunit people to confirm :-)

Re: [PATCH] rcu/kvfree: Fix data-race in __mod_timer / kvfree_call_rcu

2024-10-23 Thread Frederic Weisbecker

Le Tue, Oct 22, 2024 at 12:53:07PM +0200, Uladzislau Rezki (Sony) a écrit :
> KCSAN reports a data race when access the krcp->monitor_work.timer.expires
> variable in the schedule_delayed_monitor_work() function:
> 
> 
> BUG: KCSAN: data-race in __mod_timer / kvfree_call_rcu
> 
> read to 0x888237d1cce8 of 8 bytes by task 10149 on cpu 1:
>  schedule_delayed_monitor_work kernel/rcu/tree.c:3520 [inline]
>  kvfree_call_rcu+0x3b8/0x510 kernel/rcu/tree.c:3839
>  trie_update_elem+0x47c/0x620 kernel/bpf/lpm_trie.c:441
>  bpf_map_update_value+0x324/0x350 kernel/bpf/syscall.c:203
>  generic_map_update_batch+0x401/0x520 kernel/bpf/syscall.c:1849
>  bpf_map_do_batch+0x28c/0x3f0 kernel/bpf/syscall.c:5143
>  __sys_bpf+0x2e5/0x7a0
>  __do_sys_bpf kernel/bpf/syscall.c:5741 [inline]
>  __se_sys_bpf kernel/bpf/syscall.c:5739 [inline]
>  __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5739
>  x64_sys_call+0x2625/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:322
>  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
>  do_syscall_64+0xc9/0x1c0 arch/x86/entry/common.c:83
>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> write to 0x888237d1cce8 of 8 bytes by task 56 on cpu 0:
>  __mod_timer+0x578/0x7f0 kernel/time/timer.c:1173
>  add_timer_global+0x51/0x70 kernel/time/timer.c:1330
>  __queue_delayed_work+0x127/0x1a0 kernel/workqueue.c:2523
>  queue_delayed_work_on+0xdf/0x190 kernel/workqueue.c:2552
>  queue_delayed_work include/linux/workqueue.h:677 [inline]
>  schedule_delayed_monitor_work kernel/rcu/tree.c:3525 [inline]
>  kfree_rcu_monitor+0x5e8/0x660 kernel/rcu/tree.c:3643
>  process_one_work kernel/workqueue.c:3229 [inline]
>  process_scheduled_works+0x483/0x9a0 kernel/workqueue.c:3310
>  worker_thread+0x51d/0x6f0 kernel/workqueue.c:3391
>  kthread+0x1d1/0x210 kernel/kthread.c:389
>  ret_from_fork+0x4b/0x60 arch/x86/kernel/process.c:147
>  ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 UID: 0 PID: 56 Comm: kworker/u8:4 Not tainted 
> 6.12.0-rc2-syzkaller-00050-g5b7c893ed5ed #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 09/13/2024
> Workqueue: events_unbound kfree_rcu_monitor
> 
> 
> kfree_rcu_monitor() rearms the work if a "krcp" has to be still
> offloaded and this is done without holding krcp->lock, whereas
> the kvfree_call_rcu() holds it.
> 
> Fix it by acquiring the "krcp->lock" for kfree_rcu_monitor() so
> both functions do not race anymore.
> 
> Reported-by: syzbot+061d370693bdd99f9...@syzkaller.appspotmail.com
> Link: https://lore.kernel.org/lkml/ZxZ68KmHDQYU0yfD@pc636/T/
> Fixes: 8fc5494ad5fa ("rcu/kvfree: Move need_offload_krc() out of krcp->lock")
> Signed-off-by: Uladzislau Rezki (Sony) 

Applied, thanks!

Re: [PATCH net-next v5 2/3] connector/cn_proc: Kunit tests for threads hash table

2024-10-23 Thread Anjali Kulkarni



> On Oct 23, 2024, at 8:05 AM, Stanislav Fomichev  wrote:
> 
> On 10/23, Anjali Kulkarni wrote:
>> […snip…]
>> 
>> 
>> Yes, make sure all required options are picked up by
>> "./tools/testing/kunit/kunit.py run" instead of manually adding options
>> and doing modprobe.
> 
> The environment issues are resolved and I am able to run kunit.py, but my 
> tests
> are not invoked without giving options via —kconfig-add. Other tests are 
> also not
> invoked. Running with the manual options runs 413 tests, and with just 
> kunit.py
> runs 389 tests. (I have added 6). Any idea how I can make it run my tests?
 
 The runner does: ./tools/testing/kunit/kunit.py run --alltests
 Is it not enough in your case? What options do you pass via
 --kconfig-add? Is it because CONNECTOR stuff is disabled by default?
>>> 
>>> No, it still does not run.
>>> However, I added to tools/testing/kunit/configs/all_tests.config:
>>> 
>>> CONFIG_CONNECTOR=y
>>> CONFIG_PROC_EVENTS=y
>>> CONFIG_NET=y
>>> CONFIG_CN_HASH_KUNIT_TEST=y
>>> 
>>> And now it does run.
>>> Should I make the change above? I will also check with the kunit guys.
>>> But I do not understand how it ran for you(and run into the error), or did
>>> it just try to compile?
>> 
>> I see this in comments on top of all_tests.config.
>> 
>> # The config is manually maintained, though it uses KUNIT_ALL_TESTS=y to 
>> enable
>> # any tests whose dependencies are already satisfied. Please feel free to add
>> # more options if they any new tests.
>> 
>> So I suppose if a test needs more dependencies, it needs to be added here.
> 
> Let's try and CC a bunch of kunit people to confirm :-)

Ok! Will send out a new patch and cc the kunit folks on it. Hopefully it works:)

[PATCH net-next v6 1/3] connector/cn_proc: Add hash table for threads

2024-10-23 Thread Anjali Kulkarni

Add a new type PROC_CN_MCAST_NOTIFY to proc connector API, which allows a
thread to notify the kernel that is going to exit with a non-zero exit
code and specify the exit code in it. When thread exits in the kernel,
it will send this exit code as a proc filter notification to any
listening process.
Exiting thread can call this either when it wants to call pthread_exit()
with non-zero value or from signal handler.

Add a new file cn_hash.c which implements a hash table storing the exit
codes of abnormally exiting threads, received by the system call above.
The key used for the hash table is the pid of the thread, so when the
thread actually exits, we lookup it's pid in the hash table and retrieve
the exit code sent by user. If the exit code in struct task is 0, we
then replace it with the user supplied non-zero exit code.

cn_hash.c implements the hash table add, delete, lookup operations.
mutex_lock() and mutex_unlock() operations are used to safeguard the
integrity of the hash table while adding or deleting elements.
connector.c has the API calls, called from cn_proc.c, as well as calls
to allocate, initialize and free the hash table.

Add a new flag in PF_* flags of task_struct - EXIT_NOTIFY. This flag is
set when user sends the exit code via PROC_CN_MCAST_NOTIFY. While
exiting, this flag is checked and the hash table add or delete calls
are only made if this flag is set.

A refcount field hrefcnt is added in struct cn_hash_dev, to keep track
of number of threads which have added an entry in hash table. Before
freeing the struct cn_hash_dev, this value must be 0.
This refcnt check is added in case CONFIG_CONNECTOR is compiled as a
module. In that case, when unloading the module, we need to make sure
no hash entries are still present in the hdev table.

Signed-off-by: Anjali Kulkarni 
---
 drivers/connector/Makefile|   2 +-
 drivers/connector/cn_hash.c   | 176 ++
 drivers/connector/cn_proc.c   |  81 +---
 drivers/connector/connector.c |  76 +--
 include/linux/connector.h |  58 ---
 include/linux/sched.h |   2 +-
 include/uapi/linux/cn_proc.h  |   4 +-
 7 files changed, 364 insertions(+), 35 deletions(-)
 create mode 100644 drivers/connector/cn_hash.c

diff --git a/drivers/connector/Makefile b/drivers/connector/Makefile
index 1bf67d3df97d..cb1dcdf067ad 100644
--- a/drivers/connector/Makefile
+++ b/drivers/connector/Makefile
@@ -2,4 +2,4 @@
 obj-$(CONFIG_CONNECTOR)+= cn.o
 obj-$(CONFIG_PROC_EVENTS)  += cn_proc.o
 
-cn-y   += cn_queue.o connector.o
+cn-y   += cn_hash.o cn_queue.o connector.o
diff --git a/drivers/connector/cn_hash.c b/drivers/connector/cn_hash.c
new file mode 100644
index ..b94f6c461496
--- /dev/null
+++ b/drivers/connector/cn_hash.c
@@ -0,0 +1,176 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Author: Anjali Kulkarni 
+ *
+ * Copyright (c) 2024 Oracle and/or its affiliates.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+struct cn_hash_dev *cn_hash_alloc_dev(const char *name)
+{
+   struct cn_hash_dev *hdev;
+
+   hdev = kzalloc(sizeof(*hdev), GFP_KERNEL);
+   if (!hdev)
+   return NULL;
+
+   snprintf(hdev->name, sizeof(hdev->name), "%s", name);
+   atomic_set(&hdev->hrefcnt, 0);
+   mutex_init(&hdev->uexit_hash_lock);
+   hash_init(hdev->uexit_pid_htable);
+   return hdev;
+}
+
+void cn_hash_free_dev(struct cn_hash_dev *hdev)
+{
+   struct uexit_pid_hnode *hnode;
+   struct hlist_node *tmp;
+   int bucket;
+
+   pr_debug("%s: Freeing entire hdev %p\n", __func__, hdev);
+
+   mutex_lock(&hdev->uexit_hash_lock);
+   hash_for_each_safe(hdev->uexit_pid_htable, bucket, tmp,
+  hnode, uexit_pid_hlist) {
+   hash_del(&hnode->uexit_pid_hlist);
+   pr_debug("%s: Freeing node for pid %d\n",
+__func__, hnode->pid);
+   kfree(hnode);
+   }
+
+   mutex_unlock(&hdev->uexit_hash_lock);
+   mutex_destroy(&hdev->uexit_hash_lock);
+
+   /*
+* This refcnt check is added in case CONFIG_CONNECTOR is
+* compiled with =m as a module. In that case, when unloading
+* the module, we need to make sure no hash entries are still
+* present in the hdev table.
+*/
+   while (atomic_read(&hdev->hrefcnt)) {
+   pr_info("Waiting for %s to become free: refcnt=%d\n",
+   hdev->name, atomic_read(&hdev->hrefcnt));
+   msleep(1000);
+   }
+
+   kfree(hdev);
+   hdev = NULL;
+}
+
+static struct uexit_pid_hnode *cn_hash_alloc_elem(__u32 uexit_code, pid_t pid)
+{
+   struct uexit_pid_hnode *elem;
+
+   elem = kzalloc(sizeof(*elem), GFP_KERNEL);
+   if (!elem)
+   return NULL;
+
+   INIT_HLIST_NODE(&elem->uexit_pid_hlist);
+   elem->uexit_code =

[PATCH net-next v6 0/3] Threads support in proc connector

2024-10-23 Thread Anjali Kulkarni

Recently we committed a fix to allow processes to receive notifications for
non-zero exits via the process connector module. Commit is a4c9a56e6a2c.

However, for threads, when it does a pthread_exit(&exit_status) call, the
kernel is not aware of the exit status with which pthread_exit is called.
It is sent by child thread to the parent process, if it is waiting in
pthread_join(). Hence, for a thread exiting abnormally, kernel cannot
send notifications to any listening processes.

The exception to this is if the thread is sent a signal which it has not
handled, and dies along with it's process as a result; for eg. SIGSEGV or
SIGKILL. In this case, kernel is aware of the non-zero exit and sends a
notification for it.

For our use case, we cannot have parent wait in pthread_join, one of the
main reasons for this being that we do not want to track normal
pthread_exit(), which could be a very large number. We only want to be
notified of any abnormal exits. Hence, threads are created with
pthread_attr_t set to PTHREAD_CREATE_DETACHED.

To fix this problem, we add a new type PROC_CN_MCAST_NOTIFY to proc connector
API, which allows a thread to send it's exit status to kernel either when
it needs to call pthread_exit() with non-zero value to indicate some
error or from signal handler before pthread_exit().

v5->v6 changes:
- As suggested by Simon Horman, fixed style issues (some old) by running
  ./scripts/checkpatch.pl --strict --max-line-length=80
- Removed inline functions as suggested by Simon Horman
- Added "depends on" in Kconfig.debug as suggested by Stanislav Fomichev
- Removed warning while compilation with kernel space headers as
  suggested by Simon Horman
- Removed "comm" field, will send separate patch for this.
- Added kunit configs in tools/testing/kunit/configs/all_tests.config
  with it's dependencies.

v4->v5 changes:
- Handled comment by Stanislav Fomichev to fix a print format error.
- Made thread.c completely automated by starting proc_filter program
  from within threads.c.
- Changed name CONFIG_CN_HASH_KUNIT_TEST to CN_HASH_KUNIT_TEST in
  Kconfig.debug and changed display text.

v3->v4 changes:
- Reduce size of exit.log by removing unnecessary text.

v2->v3 changes:
- Handled comment by Liam Howlett to set hdev to NULL and add comment on
  it.
- Handled comment by Liam Howlett to combine functions for deleting+get
  and deleting into one in cn_hash.c
- Handled comment by Liam Howlett to remove extern in the functions
  defined in cn_hash_test.h
- Some nits by Liam Howlett fixed.
- Handled comment by Liam Howlett to make threads test automated.
  proc_filter.c creates exit.log, which is read by thread.c and checks
  the values reported.
- Added "comm" field to struct proc_event, to copy the task's name to
  the packet to allow further filtering by packets.

v1->v2 changes:
- Handled comment by Peter Zijlstra to remove locking for PF_EXIT_NOTIFY
  task->flags.
- Added error handling in thread.c

v->v1 changes:
- Handled comment by Simon Horman to remove unused err in cn_proc.c
- Handled comment by Simon Horman to make adata and key_display static
  in cn_hash_test.c

Anjali Kulkarni (3):
  connector/cn_proc: Add hash table for threads
  connector/cn_proc: Kunit tests for threads hash table
  connector/cn_proc: Selftest for threads

 drivers/connector/Makefile|   2 +-
 drivers/connector/cn_hash.c   | 216 
 drivers/connector/cn_proc.c   |  81 --
 drivers/connector/connector.c |  88 ++-
 include/linux/connector.h |  62 -
 include/linux/sched.h |   2 +-
 include/uapi/linux/cn_proc.h  |   4 +-
 lib/Kconfig.debug |  19 ++
 lib/Makefile  |   1 +
 lib/cn_hash_test.c| 169 +
 lib/cn_hash_test.h|  10 +
 tools/testing/kunit/configs/all_tests.config  |   5 +
 tools/testing/selftests/connector/Makefile|  25 ++
 .../testing/selftests/connector/proc_filter.c |  63 +++--
 tools/testing/selftests/connector/thread.c| 238 ++
 .../selftests/connector/thread_filter.c   |  96 +++
 16 files changed, 1027 insertions(+), 54 deletions(-)
 create mode 100644 drivers/connector/cn_hash.c
 create mode 100644 lib/cn_hash_test.c
 create mode 100644 lib/cn_hash_test.h
 create mode 100644 tools/testing/selftests/connector/thread.c
 create mode 100644 tools/testing/selftests/connector/thread_filter.c

-- 
2.46.0

[PATCH net-next v6 3/3] connector/cn_proc: Selftest for threads

2024-10-23 Thread Anjali Kulkarni

Test to check if setting PROC_CN_MCAST_NOTIFY in proc connector API, allows
a thread's non-zero exit status to be returned to proc_filter.

The threads.c program creates 2 child threads. 1st thread handles signal
SIGSEGV, and 2nd thread needs to indicate some error condition (value 1)
to the kernel, instead of using pthread_exit() with 1.

In both cases, child sends notify_netlink_thread_exit(exit_code) to kernel,
to let kernel know it has exited abnormally with exit_code.

Compile:
make thread
make proc_filter
Run:
./threads

Signed-off-by: Anjali Kulkarni 
---
 tools/testing/selftests/connector/Makefile|  25 ++
 .../testing/selftests/connector/proc_filter.c |  63 +++--
 tools/testing/selftests/connector/thread.c| 238 ++
 .../selftests/connector/thread_filter.c   |  96 +++
 4 files changed, 403 insertions(+), 19 deletions(-)
 create mode 100644 tools/testing/selftests/connector/thread.c
 create mode 100644 tools/testing/selftests/connector/thread_filter.c

diff --git a/tools/testing/selftests/connector/Makefile 
b/tools/testing/selftests/connector/Makefile
index 92188b9bac5c..675015bdf8ee 100644
--- a/tools/testing/selftests/connector/Makefile
+++ b/tools/testing/selftests/connector/Makefile
@@ -1,6 +1,31 @@
 # SPDX-License-Identifier: GPL-2.0
+
+# Enable if you want to use build kernel header files instead of installed
+# kernel files
+# KERNEL="../../../.."
+# CFLAGS += -Wall $(KHDR_INCLUDES) -I $(KERNEL)/include/uapi -I 
$(KERNEL)/include
+
 CFLAGS += -Wall $(KHDR_INCLUDES)
 
+proc_filter: proc_filter.o
+   cc proc_filter.o -o proc_filter
+
+proc_filter.o: proc_filter.c
+   cc -c proc_filter.c -o proc_filter.o $(CFLAGS)
+
+thread: thread.o thread_filter.o
+   cc thread.o thread_filter.o -o thread
+
+thread.o: thread.c $(DEPS)
+   cc -c thread.c -o thread.o $(CFLAGS)
+
+thread_filter.o: thread_filter.c
+   cc -c thread_filter.c -o thread_filter.o $(CFLAGS)
+
+define EXTRA_CLEAN
+   rm *.o thread
+endef
+
 TEST_GEN_PROGS = proc_filter
 
 include ../lib.mk
diff --git a/tools/testing/selftests/connector/proc_filter.c 
b/tools/testing/selftests/connector/proc_filter.c
index 4a825b997666..32a655f95c7f 100644
--- a/tools/testing/selftests/connector/proc_filter.c
+++ b/tools/testing/selftests/connector/proc_filter.c
@@ -1,4 +1,9 @@
 // SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Author: Anjali Kulkarni 
+ *
+ * Copyright (c) 2024 Oracle and/or its affiliates.
+ */
 
 #include 
 #include 
@@ -28,6 +33,7 @@
 volatile static int interrupted;
 static int nl_sock, ret_errno, tcount;
 static struct epoll_event evn;
+FILE *file;
 
 static int filter;
 
@@ -37,6 +43,8 @@ static int filter;
 #define Printf ksft_print_msg
 #endif
 
+#define EXIT_LOG
+
 int send_message(void *pinp)
 {
char buff[NL_MESSAGE_SIZE];
@@ -146,58 +154,65 @@ int handle_packet(char *buff, int fd, struct proc_event 
*event)
tcount++;
switch (event->what) {
case PROC_EVENT_EXIT:
-   Printf("Exit process %d (tgid %d) with code %d, signal 
%d\n",
+#ifdef EXIT_LOG
+   fprintf(file, "pid %d tgid %d code %d\n",
+   event->event_data.exit.process_pid,
+   event->event_data.exit.process_tgid,
+   event->event_data.exit.exit_code);
+#else
+   Printf("Exit pid %d (tgid %d) exitcode %d, signal %d\n",
   event->event_data.exit.process_pid,
   event->event_data.exit.process_tgid,
   event->event_data.exit.exit_code,
   event->event_data.exit.exit_signal);
+#endif
break;
case PROC_EVENT_FORK:
-   Printf("Fork process %d (tgid %d), parent %d (tgid 
%d)\n",
+   Printf("Fork pid %d (tgid %d), parent %d (tgid %d)\n",
   event->event_data.fork.child_pid,
   event->event_data.fork.child_tgid,
   event->event_data.fork.parent_pid,
   event->event_data.fork.parent_tgid);
break;
case PROC_EVENT_EXEC:
-   Printf("Exec process %d (tgid %d)\n",
+   Printf("Exec pid %d (tgid %d)\n",
   event->event_data.exec.process_pid,
   event->event_data.exec.process_tgid);
break;
case PROC_EVENT_UID:
-   Printf("UID process %d (tgid %d) uid %d euid %d\n",
+   Printf("UID pid %d (tgid %d) uid %d euid %d\n",
   event->event_data.id.process_pid,
   event->event_data.id.process_tgid,
   event->event_data.id.r.ruid,

[PATCH net-next v5 04/12] selftests: ncdevmem: Make client_ip optional

2024-10-23 Thread Stanislav Fomichev

Support 3-tuple filtering by making client_ip optional. When -c is
not passed, don't specify src-ip/src-port in the filter.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index 07222bfcdb07..051dd8bb0d3c 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -62,7 +62,7 @@
  */
 
 static char *server_ip = "192.168.1.4";
-static char *client_ip = "192.168.1.2";
+static char *client_ip;
 static char *port = "5201";
 static size_t do_validation;
 static int start_queue = 8;
@@ -236,8 +236,14 @@ static int configure_channels(unsigned int rx, unsigned 
int tx)
 
 static int configure_flow_steering(void)
 {
-   return run_command("sudo ethtool -N %s flow-type tcp4 src-ip %s dst-ip 
%s src-port %s dst-port %s queue %d >&2",
-  ifname, client_ip, server_ip, port, port, 
start_queue);
+   return run_command("sudo ethtool -N %s flow-type tcp4 %s %s dst-ip %s 
%s %s dst-port %s queue %d >&2",
+  ifname,
+  client_ip ? "src-ip" : "",
+  client_ip ?: "",
+  server_ip,
+  client_ip ? "src-port" : "",
+  client_ip ? port : "",
+  port, start_queue);
 }
 
 static int bind_rx_queue(unsigned int ifindex, unsigned int dmabuf_fd,
-- 
2.47.0

[PATCH AUTOSEL 6.6 03/23] selftests/bpf: Verify that sync_linked_regs preserves subreg_def

2024-10-23 Thread Sasha Levin

From: Eduard Zingerman 

[ Upstream commit a41b3828ec056a631ad22413d4560017fed5c3bd ]

This test was added because of a bug in verifier.c:sync_linked_regs(),
upon range propagation it destroyed subreg_def marks for registers.
The test is written in a way to return an upper half of a register
that is affected by range propagation and must have it's subreg_def
preserved. This gives a return value of 0 and leads to undefined
return value if subreg_def mark is not preserved.

Signed-off-by: Eduard Zingerman 
Signed-off-by: Andrii Nakryiko 
Signed-off-by: Daniel Borkmann 
Acked-by: Daniel Borkmann 
Link: https://lore.kernel.org/bpf/20240924210844.1758441-2-eddy...@gmail.com
Signed-off-by: Sasha Levin 
---
 .../selftests/bpf/progs/verifier_scalar_ids.c | 67 +++
 1 file changed, 67 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c 
b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
index 13b29a7faa71a..d24d3a36ec144 100644
--- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
+++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
@@ -656,4 +656,71 @@ __naked void two_old_ids_one_cur_id(void)
: __clobber_all);
 }
 
+SEC("socket")
+/* Note the flag, see verifier.c:opt_subreg_zext_lo32_rnd_hi32() */
+__flag(BPF_F_TEST_RND_HI32)
+__success
+/* This test was added because of a bug in verifier.c:sync_linked_regs(),
+ * upon range propagation it destroyed subreg_def marks for registers.
+ * The subreg_def mark is used to decide whether zero extension instructions
+ * are needed when register is read. When BPF_F_TEST_RND_HI32 is set it
+ * also causes generation of statements to randomize upper halves of
+ * read registers.
+ *
+ * The test is written in a way to return an upper half of a register
+ * that is affected by range propagation and must have it's subreg_def
+ * preserved. This gives a return value of 0 and leads to undefined
+ * return value if subreg_def mark is not preserved.
+ */
+__retval(0)
+/* Check that verifier believes r1/r0 are zero at exit */
+__log_level(2)
+__msg("4: (77) r1 >>= 32 ; R1_w=0")
+__msg("5: (bf) r0 = r1   ; R0_w=0 R1_w=0")
+__msg("6: (95) exit")
+__msg("from 3 to 4")
+__msg("4: (77) r1 >>= 32 ; R1_w=0")
+__msg("5: (bf) r0 = r1   ; R0_w=0 R1_w=0")
+__msg("6: (95) exit")
+/* Verify that statements to randomize upper half of r1 had not been
+ * generated.
+ */
+__xlated("call unknown")
+__xlated("r0 &= 2147483647")
+__xlated("w1 = w0")
+/* This is how disasm.c prints BPF_ZEXT_REG at the moment, x86 and arm
+ * are the only CI archs that do not need zero extension for subregs.
+ */
+#if !defined(__TARGET_ARCH_x86) && !defined(__TARGET_ARCH_arm64)
+__xlated("w1 = w1")
+#endif
+__xlated("if w0 < 0xa goto pc+0")
+__xlated("r1 >>= 32")
+__xlated("r0 = r1")
+__xlated("exit")
+__naked void linked_regs_and_subreg_def(void)
+{
+   asm volatile (
+   "call %[bpf_ktime_get_ns];"
+   /* make sure r0 is in 32-bit range, otherwise w1 = w0 won't
+* assign same IDs to registers.
+*/
+   "r0 &= 0x7fff;"
+   /* link w1 and w0 via ID */
+   "w1 = w0;"
+   /* 'if' statement propagates range info from w0 to w1,
+* but should not affect w1->subreg_def property.
+*/
+   "if w0 < 10 goto +0;"
+   /* r1 is read here, on archs that require subreg zero
+* extension this would cause zext patch generation.
+*/
+   "r1 >>= 32;"
+   "r0 = r1;"
+   "exit;"
+   :
+   : __imm(bpf_ktime_get_ns)
+   : __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.43.0

Re: [PATCH v4 2/4] pidfd: add PIDFD_SELF_* sentinels to refer to own thread/process

2024-10-23 Thread Shakeel Butt

On Wed, Oct 23, 2024 at 08:18:35AM GMT, Lorenzo Stoakes wrote:
> On Tue, Oct 22, 2024 at 05:53:00PM -0700, Shakeel Butt wrote:
> > On Thu, Oct 17, 2024 at 10:05:50PM GMT, Lorenzo Stoakes wrote:
> > > It is useful to be able to utilise the pidfd mechanism to reference the
> > > current thread or process (from a userland point of view - thread group
> > > leader from the kernel's point of view).
> > >
> > > Therefore introduce PIDFD_SELF_THREAD to refer to the current thread, and
> > > PIDFD_SELF_THREAD_GROUP to refer to the current thread group leader.
> > >
> > > For convenience and to avoid confusion from userland's perspective we 
> > > alias
> > > these:
> > >
> > > * PIDFD_SELF is an alias for PIDFD_SELF_THREAD - This is nearly always 
> > > what
> > >   the user will want to use, as they would find it surprising if for
> > >   instance fd's were unshared()'d and they wanted to invoke pidfd_getfd()
> > >   and that failed.
> > >
> > > * PIDFD_SELF_PROCESS is an alias for PIDFD_SELF_THREAD_GROUP - Most users
> > >   have no concept of thread groups or what a thread group leader is, and
> > >   from userland's perspective and nomenclature this is what userland
> > >   considers to be a process.
> >
> > Should users use PIDFD_SELF_PROCESS in process_madvise() for self
> > madvise() (once the support is added)?
> 
> You can use either it will make no difference as both will get you to
> current->mm which has to be shared. So I'd go with PIDFD_SELF for brevity
> :)
> 
> This series and the prerequisites I already added to process_madvise()
> already provide support so with this in you can just use this for that.

Thanks a lot, this is awesome. Is the plan for this series to go through
mm-tree or through Christian?

[PATCH net-next v5 12/12] selftests: ncdevmem: Add automated test

2024-10-23 Thread Stanislav Fomichev

Only RX side for now and small message to test the setup.
In the future, we can extend it to TX side and to testing
both sides with a couple of megs of data.

  make \
-C tools/testing/selftests \
TARGETS="drivers/hw/net" \
install INSTALL_PATH=~/tmp/ksft

  scp ~/tmp/ksft ${HOST}:
  scp ~/tmp/ksft ${PEER}:

  cfg+="NETIF=${DEV}\n"
  cfg+="LOCAL_V6=${HOST_IP}\n"
  cfg+="REMOTE_V6=${PEER_IP}\n"
  cfg+="REMOTE_TYPE=ssh\n"
  cfg+="REMOTE_ARGS=root@${PEER}\n"

  echo -e "$cfg" | ssh root@${HOST} "cat > ksft/drivers/net/net.config"
  ssh root@${HOST} "cd ksft && ./run_kselftest.sh -t drivers/net:devmem.py"

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 .../testing/selftests/drivers/net/hw/Makefile |  1 +
 .../selftests/drivers/net/hw/devmem.py| 45 +++
 2 files changed, 46 insertions(+)
 create mode 100755 tools/testing/selftests/drivers/net/hw/devmem.py

diff --git a/tools/testing/selftests/drivers/net/hw/Makefile 
b/tools/testing/selftests/drivers/net/hw/Makefile
index 182348f4bd40..1c6a77480923 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -3,6 +3,7 @@
 TEST_PROGS = \
csum.py \
devlink_port_split.py \
+   devmem.py \
ethtool.sh \
ethtool_extended_state.sh \
ethtool_mm.sh \
diff --git a/tools/testing/selftests/drivers/net/hw/devmem.py 
b/tools/testing/selftests/drivers/net/hw/devmem.py
new file mode 100755
index ..1416c31ff81e
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/devmem.py
@@ -0,0 +1,45 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+from lib.py import ksft_run, ksft_exit
+from lib.py import ksft_eq, KsftSkipEx
+from lib.py import NetDrvEpEnv
+from lib.py import bkg, cmd, rand_port, wait_port_listen
+from lib.py import ksft_disruptive
+
+
+def require_devmem(cfg):
+if not hasattr(cfg, "_devmem_probed"):
+port = rand_port()
+probe_command = f"./ncdevmem -f {cfg.ifname}"
+cfg._devmem_supported = cmd(probe_command, fail=False, shell=True).ret 
== 0
+cfg._devmem_probed = True
+
+if not cfg._devmem_supported:
+raise KsftSkipEx("Test requires devmem support")
+
+
+@ksft_disruptive
+def check_rx(cfg) -> None:
+cfg.require_v6()
+require_devmem(cfg)
+
+port = rand_port()
+listen_cmd = f"./ncdevmem -l -f {cfg.ifname} -s {cfg.v6} -p {port}"
+
+with bkg(listen_cmd) as nc:
+wait_port_listen(port)
+cmd(f"echo -e \"hello\\nworld\"| nc {cfg.v6} {port}", host=cfg.remote, 
shell=True)
+
+ksft_eq(nc.stdout.strip(), "hello\nworld")
+
+
+def main() -> None:
+with NetDrvEpEnv(__file__) as cfg:
+ksft_run([check_rx],
+ args=(cfg, ))
+ksft_exit()
+
+
+if __name__ == "__main__":
+main()
-- 
2.47.0

[PATCH v2 2/2] soc: qcom: pmic_glink: Handle GLINK intent allocation rejections

2024-10-23 Thread Bjorn Andersson

Some versions of the pmic_glink firmware does not allow dynamic GLINK
intent allocations, attempting to send a message before the firmware has
allocated its receive buffers and announced these intent allocations
will fail. When this happens something like this showns up in the log:

pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to send 
altmode request: 0x10 (-125)
pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to 
request altmode notifications: -125
ucsi_glink.pmic_glink_ucsi pmic_glink.ucsi.0: failed to send UCSI read 
request: -125
qcom_battmgr.pmic_glink_power_supply pmic_glink.power-supply.0: failed to 
request power notifications

GLINK has been updated to distinguish between the cases where the remote
is going down (-ECANCELED) and the intent allocation being rejected
(-EAGAIN).

Retry the send until intent buffers becomes available, or an actual
error occur.

To avoid infinitely waiting for the firmware in the event that this
misbehaves and no intents arrive, an arbitrary 5 second timeout is
used.

This patch was developed with input from Chris Lew.

Reported-by: Johan Hovold 
Closes: https://lore.kernel.org/all/zqet8iinndhnx...@hovoldconsulting.com/#t
Cc: sta...@vger.kernel.org # rpmsg: glink: Handle rejected intent request better
Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver")
Tested-by: Johan Hovold 
Reviewed-by: Johan Hovold 
Signed-off-by: Bjorn Andersson 
---
 drivers/soc/qcom/pmic_glink.c | 25 ++---
 1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/soc/qcom/pmic_glink.c b/drivers/soc/qcom/pmic_glink.c
index 
9606222993fd78e80d776ea299cad024a0197e91..baa4ac6704a901661d1055c5caeaab61dc315795
 100644
--- a/drivers/soc/qcom/pmic_glink.c
+++ b/drivers/soc/qcom/pmic_glink.c
@@ -4,6 +4,7 @@
  * Copyright (c) 2022, Linaro Ltd
  */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -13,6 +14,8 @@
 #include 
 #include 
 
+#define PMIC_GLINK_SEND_TIMEOUT (5 * HZ)
+
 enum {
PMIC_GLINK_CLIENT_BATT = 0,
PMIC_GLINK_CLIENT_ALTMODE,
@@ -112,13 +115,29 @@ EXPORT_SYMBOL_GPL(pmic_glink_client_register);
 int pmic_glink_send(struct pmic_glink_client *client, void *data, size_t len)
 {
struct pmic_glink *pg = client->pg;
+   bool timeout_reached = false;
+   unsigned long start;
int ret;
 
mutex_lock(&pg->state_lock);
-   if (!pg->ept)
+   if (!pg->ept) {
ret = -ECONNRESET;
-   else
-   ret = rpmsg_send(pg->ept, data, len);
+   } else {
+   start = jiffies;
+   for (;;) {
+   ret = rpmsg_send(pg->ept, data, len);
+   if (ret != -EAGAIN)
+   break;
+
+   if (timeout_reached) {
+   ret = -ETIMEDOUT;
+   break;
+   }
+
+   usleep_range(1000, 5000);
+   timeout_reached = time_after(jiffies, start + 
PMIC_GLINK_SEND_TIMEOUT);
+   }
+   }
mutex_unlock(&pg->state_lock);
 
return ret;

-- 
2.43.0

[PATCH v2 1/2] rpmsg: glink: Handle rejected intent request better

2024-10-23 Thread Bjorn Andersson

GLINK operates using pre-allocated buffers, aka intents, where incoming
messages are aggregated before being passed up the stack. In the case
that no suitable intents have been announced by the receiver, the sender
can request an intent to be allocated.

The initial implementation of the response to such request dealt
with two outcomes; granted allocations, and all other cases being
considered -ECANCELLED (likely from "cancelling the operation as the
remote is going down").

But on some channels intent allocation is not supported, instead the
remote will pre-allocate and announce a fixed number of intents for the
sender to use. If for such channels an rpmsg_send() is being invoked
before any channels have been announced, an intent request will be
issued and as this comes back rejected the call fails with -ECANCELED.

Given that this is reported in the same way as the remote being shut
down, there's no way for the client to differentiate the two cases.

In line with the original GLINK design, change the return value to
-EAGAIN for the case where the remote rejects an intent allocation
request.

It's tempting to handle this case in the GLINK core, as we expect
intents to show up in this case. But there's no way to distinguish
between this case and a rejection for a too big allocation, nor is it
possible to predict if a currently used (and seemingly suitable) intent
will be returned for reuse or not. As such, returning the error to the
client and allow it to react seems to be the only sensible solution.

In addition to this, commit 'c05dfce0b89e ("rpmsg: glink: Wait for
intent, not just request ack")' changed the logic such that the code
always wait for an intent request response and an intent. This works out
in most cases, but in the event that an intent request is rejected and no
further intent arrives (e.g. client asks for a too big intent), the code
will stall for 10 seconds and then return -ETIMEDOUT; instead of a more
suitable error.

This change also resulted in intent requests racing with the shutdown of
the remote would be exposed to this same problem, unless some intent
happens to arrive. A patch for this was developed and posted by Sarannya
S [1], and has been incorporated here.

To summarize, the intent request can end in 4 ways:
- Timeout, no response arrived => return -ETIMEDOUT
- Abort TX, the edge is going away => return -ECANCELLED
- Intent request was rejected => return -EAGAIN
- Intent request was accepted, and an intent arrived => return 0

This patch was developed with input from Sarannya S, Deepak Kumar Singh,
and Chris Lew.

[1] 
https://lore.kernel.org/all/20240925072328.1163183-1-quic_dee...@quicinc.com/

Fixes: c05dfce0b89e ("rpmsg: glink: Wait for intent, not just request ack")
Cc: sta...@vger.kernel.org
Tested-by: Johan Hovold 
Signed-off-by: Bjorn Andersson 
---
 drivers/rpmsg/qcom_glink_native.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/rpmsg/qcom_glink_native.c 
b/drivers/rpmsg/qcom_glink_native.c
index 
0b2f290069080638581a13b3a580054d31e176c2..d3af1dfa3c7d71b95dda911dfc7ad844679359d6
 100644
--- a/drivers/rpmsg/qcom_glink_native.c
+++ b/drivers/rpmsg/qcom_glink_native.c
@@ -1440,14 +1440,18 @@ static int qcom_glink_request_intent(struct qcom_glink 
*glink,
goto unlock;
 
ret = wait_event_timeout(channel->intent_req_wq,
-READ_ONCE(channel->intent_req_result) >= 0 &&
-READ_ONCE(channel->intent_received),
+READ_ONCE(channel->intent_req_result) == 0 ||
+(READ_ONCE(channel->intent_req_result) > 0 &&
+ READ_ONCE(channel->intent_received)) ||
+glink->abort_tx,
 10 * HZ);
if (!ret) {
dev_err(glink->dev, "intent request timed out\n");
ret = -ETIMEDOUT;
+   } else if (glink->abort_tx) {
+   ret = -ECANCELED;
} else {
-   ret = READ_ONCE(channel->intent_req_result) ? 0 : -ECANCELED;
+   ret = READ_ONCE(channel->intent_req_result) ? 0 : -EAGAIN;
}
 
 unlock:

-- 
2.43.0

Re: [PATCH net-next v3] selftest/tcp-ao: Add filter tests

2024-10-23 Thread Simon Horman

On Mon, Oct 21, 2024 at 10:46:44AM -0700, Leo Stone wrote:
> Add tests that check if getsockopt(TCP_AO_GET_KEYS) returns the right
> keys when using different filters.
> 
> Sample output:
> 
> > # ok 114 filter keys: by sndid, rcvid, address
> > # ok 115 filter keys: by is_current
> > # ok 116 filter keys: by is_rnext
> > # ok 117 filter keys: by sndid, rcvid
> > # ok 118 filter keys: correct nkeys when in.nkeys < matches
> 
> Acked-by: Dmitry Safonov <0x7f454...@gmail.com>
> Signed-off-by: Leo Stone 
> ---
> v3:
>   - Ordered locals in reverse xmas tree order
>   - Separated socket fd declaration from assignment
>   - Broke lines longer than 80 columns
> v2: https://lore.kernel.org/netdev/20241016055823.21299-1-leocst...@gmail.com/
>   - Changed 2 unnecessary test_error calls to test_fail
>   - Added another test to make sure getsockopt returns the right nkeys
>   value when the input nkeys is smaller than the number of matching keys
>   - Removed the TODO that this patch addresses
> v1: https://lore.kernel.org/netdev/20241014213313.15100-1-leocst...@gmail.com/
> 
> Thanks to the reviewers for their time and feedback!

Thanks for the updates.

Reviewed-by: Simon Horman

[PATCH AUTOSEL 6.11 20/30] selftests/bpf: Assert link info uprobe_multi count & path_size if unset

2024-10-23 Thread Sasha Levin

From: Tyrone Wu 

[ Upstream commit b836cbdf3b81a4a22b3452186efa2e5105a77e10 ]

Add assertions in `bpf_link_info.uprobe_multi` test to verify that
`count` and `path_size` fields are correctly populated when the fields
are unset.

This tests a previous bug where the `path_size` field was not populated
when `path` and `path_size` were unset.

Signed-off-by: Tyrone Wu 
Signed-off-by: Andrii Nakryiko 
Link: https://lore.kernel.org/bpf/20241011000803.681190-2-wudevel...@gmail.com
Signed-off-by: Sasha Levin 
---
 tools/testing/selftests/bpf/prog_tests/fill_link_info.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/fill_link_info.c 
b/tools/testing/selftests/bpf/prog_tests/fill_link_info.c
index f3932941bbaaf..59ef57145b63c 100644
--- a/tools/testing/selftests/bpf/prog_tests/fill_link_info.c
+++ b/tools/testing/selftests/bpf/prog_tests/fill_link_info.c
@@ -417,6 +417,15 @@ verify_umulti_link_info(int fd, bool retprobe, __u64 
*offsets,
if (!ASSERT_NEQ(err, -1, "readlink"))
return -1;
 
+   memset(&info, 0, sizeof(info));
+   err = bpf_link_get_info_by_fd(fd, &info, &len);
+   if (!ASSERT_OK(err, "bpf_link_get_info_by_fd"))
+   return -1;
+
+   ASSERT_EQ(info.uprobe_multi.count, 3, "info.uprobe_multi.count");
+   ASSERT_EQ(info.uprobe_multi.path_size, strlen(path) + 1,
+ "info.uprobe_multi.path_size");
+
for (bit = 0; bit < 8; bit++) {
memset(&info, 0, sizeof(info));
info.uprobe_multi.path = ptr_to_u64(path_buf);
-- 
2.43.0

Re: [PATCH v3] selftests: add new kallsyms selftests

2024-10-23 Thread Jeff Johnson

On 10/21/24 12:33, Luis Chamberlain wrote:
...
> +gen_template_module_exit()
> +{
> + cat < +static int __init auto_test_module_init(void)
> +{
> + return auto_runtime_test();
> +}
> +module_init(auto_test_module_init);
> +
> +static void __exit auto_test_module_exit(void)
> +{
> +}
> +module_exit(auto_test_module_exit);
> +
> +MODULE_AUTHOR("Luis Chamberlain ");
> +MODULE_LICENSE("GPL");
> +END_MODULE
> +}

Since commit 1fffe7a34c89 ("script: modpost: emit a warning when the
description is missing"), a module without a MODULE_DESCRIPTION() will
result in a warning when built with make W=1. Is that a concern here?
Should we add a MODULE_DESCRIPTION()?

/jeff

Re: [PATCH v4 14/19] gendwarfksyms: Add support for kABI rules

2024-10-23 Thread Sami Tolvanen

Hi,

On Tue, Oct 22, 2024 at 7:39 AM Petr Pavlu  wrote:
>
> On 10/8/24 20:38, Sami Tolvanen wrote:
> > +/*
> > + * KABI_USE_ARRAY(fqn)
> > + *   Treat the struct fqn as a declaration, i.e. even if a definition
> > + *   is available, don't expand the contents.
> > + */
> > +#define KABI_STRUCT_DECLONLY(fqn) __KABI_RULE(struct_declonly, fqn, ;)
>
> Nit: s/KABI_USE_ARRAY/KABI_STRUCT_DECLONLY/ in the preceding comment.

Thanks, I'll fix this in the next version.

> > + * Verify --stable output:
> > + *
> > + * RUN: echo -e "ex0\nex1" | \
> > + * RUN:   ./gendwarfksyms --stable --dump-dies \
> > + * RUN:  examples/kabi_rules.o 2>&1 >/dev/null | \
> > + * RUN:   FileCheck examples/kabi_rules.c --check-prefix=STABLE
> > + */
>
> It would be useful to make this test automated. Overall, I believe
> gendwarfksyms should have a set of automated tests to verify its
> functionality. At a minimum, I think we would want to work out some
> blueprint how to write them. Should they be added to kselftests, or
> would something like kconfig/tests be more appropriate? How to write
> tests with stable DWARF data that ideally work across all platforms?
> More tests can be then added incrementally.

Different compilers produce slightly different DWARF data, so we can't
necessarily guarantee that the output is the same even between
different compilers, let alone architectures, which makes automated
testing a bit more challenging. If we want tests for simple cases like
in these example files, it should be possible to work something out.
Otherwise, I think the best way to test the tool is to do build tests
and ensure that there are no warnings or errors, e.g. for missing
versions. Did you have any thoughts about the kinds of tests you'd
like to see?

> > +#define KABI_RULE_EMPTY_VALUE ";"
>
> Hmm, is there a reason why an empty value is ";" instead of just ""?

Not really, I can change this to an empty string in v5.

> > +
> > +/*
> > + * Rule: struct_declonly
> > + * - For the struct in the target field, treat it as a declaration
> > + *   only even if a definition is available.
> > + */
> > +#define KABI_RULE_TAG_STRUCT_DECLONLY "struct_declonly"
> > +
> > +/*
> > + * Rule: enumerator_ignore
> > + * - For the enum in the target field, ignore the named enumerator
> > + *   in the value field.
> > + */
> > +#define KABI_RULE_TAG_ENUMERATOR_IGNORE "enumerator_ignore"
> > +
> > +enum kabi_rule_type {
> > + KABI_RULE_TYPE_UNKNOWN,
> > + KABI_RULE_TYPE_STRUCT_DECLONLY,
> > + KABI_RULE_TYPE_ENUMERATOR_IGNORE,
> > +};
>
> Nit: All new KABI_* defines and the enum kabi_rule_type added in
> gendwarfksyms.h are used only locally from kabi.c, so they could be
> moved in that file.

True, I'll move these.

> > +struct rule {
> > + enum kabi_rule_type type;
> > + const char *target;
> > + const char *value;
> > + struct hlist_node hash;
> > +};
>
> What is the idea behind using 'const char *' instead of 'char *' for
> owned strings in structures?

I mentioned this in the previous response, but it's to make it more
obvious that the contents of these strings shouldn't be modified by
the users of this struct.

> > +static inline unsigned int rule_hash(enum kabi_rule_type type,
> > +  const char *target, const char *value)
> > +{
> > + return hash_32(type) ^ hash_str(target) ^ hash_str(value);
> > +}
> > +
> > +static inline unsigned int __rule_hash(const struct rule *rule)
> > +{
> > + return rule_hash(rule->type, rule->target, rule->value);
> > +}
>
> Nit: Perhaps call the two hash functions rule_values_hash() and
> rule_hash() to avoid the "__" prefix?

Sure, I'll rename the functions.

> As a general comment, I believe the gendwarfksyms code overuses the "__"
> prefix. Similarly, I find harder to navigate its code when, in a few
> instances, there is a function named _() and another as
> _(). An example of both would be the functions
> expand_type(), type_expand() and __type_expand().

I suspect this is a matter of personal preference. I don't have strong
feelings about the naming, but I'm happy to accept suggestions!

> > + scn = elf_nextscn(elf, NULL);
> > +
> > + while (scn) {
> > + shdr = gelf_getshdr(scn, &shdr_mem);
> > + if (shdr) {
>
> Isn't it an error when gelf_getshdr() returns NULL and as such it should
> be reported with error()? If this makes sense then the same handling
> should be implemented in symbols.c:elf_for_each_global().

Good point, I'll change this, also in symbols.c.

>
> > + const char *sname =
> > + elf_strptr(elf, shstrndx, shdr->sh_name);
> > +
> > + if (sname && !strcmp(sname, KABI_RULE_SECTION)) {
> > + rule_data = elf_getdata(scn, NULL);
>
> Similarly here for elf_strptr() and elf_getdata().

Ack.

Sami

Re: [PATCH] selftests: livepatch: add test cases of stack_order sysfs interface

2024-10-23 Thread Petr Mladek

On Fri 2024-10-11 23:11:51, Wardenjohn wrote:
> Add selftest test cases to sysfs attribute 'stack_order'.
> 
> Signed-off-by: Wardenjohn 
> ---
>  .../testing/selftests/livepatch/test-sysfs.sh | 74 +++
>  1 file changed, 74 insertions(+)
> 
> diff --git a/tools/testing/selftests/livepatch/test-sysfs.sh 
> b/tools/testing/selftests/livepatch/test-sysfs.sh
> index 05a14f5a7bfb..71a2e95636b1 100755
> --- a/tools/testing/selftests/livepatch/test-sysfs.sh
> +++ b/tools/testing/selftests/livepatch/test-sysfs.sh
> @@ -5,6 +5,8 @@
>  . $(dirname $0)/functions.sh
>  
>  MOD_LIVEPATCH=test_klp_livepatch
> +MOD_LIVEPATCH2=test_klp_callbacks_demo
> +MOD_LIVEPATCH3=test_klp_syscall
>  
>  setup_config
>  
> @@ -131,4 +133,76 @@ livepatch: '$MOD_LIVEPATCH': completing unpatching 
> transition
>  livepatch: '$MOD_LIVEPATCH': unpatching complete
>  % rmmod $MOD_LIVEPATCH"
>  
> +start_test "sysfs test stack_order read"

s/read/value/

> +
> +load_lp $MOD_LIVEPATCH
> +
> +check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"

The access rights should be rather tested in the 1st test in
test-sysfs.sh. We do not need to check it repeatedly here.

> +check_sysfs_value  "$MOD_LIVEPATCH" "stack_order" "1"
> +
> +load_lp $MOD_LIVEPATCH2
> +
> +check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"

Same here...

> +check_sysfs_value  "$MOD_LIVEPATCH2" "stack_order" "2"
> +
> +load_lp $MOD_LIVEPATCH3
> +
> +check_sysfs_rights "$MOD_LIVEPATCH3" "stack_order" "-r--r--r--"
> +check_sysfs_value  "$MOD_LIVEPATCH3" "stack_order" "3"
> +
> +disable_lp $MOD_LIVEPATCH2
> +unload_lp $MOD_LIVEPATCH2
> +
> +check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"
> +check_sysfs_value  "$MOD_LIVEPATCH" "stack_order" "1"
> +check_sysfs_rights "$MOD_LIVEPATCH3" "stack_order" "-r--r--r--"
> +check_sysfs_value  "$MOD_LIVEPATCH3" "stack_order" "2"
> +
> +disable_lp $MOD_LIVEPATCH3
> +unload_lp $MOD_LIVEPATCH3
> +
> +disable_lp $MOD_LIVEPATCH
> +unload_lp $MOD_LIVEPATCH

Otherwise, it looks good to me.

Just to make it clear, I suggest to do the following changes:

diff --git a/tools/testing/selftests/livepatch/test-sysfs.sh 
b/tools/testing/selftests/livepatch/test-sysfs.sh
index 71a2e95636b1..e44a051be307 100755
--- a/tools/testing/selftests/livepatch/test-sysfs.sh
+++ b/tools/testing/selftests/livepatch/test-sysfs.sh
@@ -21,6 +21,8 @@ check_sysfs_rights "$MOD_LIVEPATCH" "enabled" "-rw-r--r--"
 check_sysfs_value  "$MOD_LIVEPATCH" "enabled" "1"
 check_sysfs_rights "$MOD_LIVEPATCH" "force" "--w---"
 check_sysfs_rights "$MOD_LIVEPATCH" "replace" "-r--r--r--"
+check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"
+check_sysfs_value  "$MOD_LIVEPATCH" "stack_order" "1"
 check_sysfs_rights "$MOD_LIVEPATCH" "transition" "-r--r--r--"
 check_sysfs_value  "$MOD_LIVEPATCH" "transition" "0"
 check_sysfs_rights "$MOD_LIVEPATCH" "vmlinux/patched" "-r--r--r--"
@@ -133,29 +135,24 @@ livepatch: '$MOD_LIVEPATCH': completing unpatching 
transition
 livepatch: '$MOD_LIVEPATCH': unpatching complete
 % rmmod $MOD_LIVEPATCH"
 
-start_test "sysfs test stack_order read"
+start_test "sysfs test stack_order value"
 
 load_lp $MOD_LIVEPATCH
 
-check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"
 check_sysfs_value  "$MOD_LIVEPATCH" "stack_order" "1"
 
 load_lp $MOD_LIVEPATCH2
 
-check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"
 check_sysfs_value  "$MOD_LIVEPATCH2" "stack_order" "2"
 
 load_lp $MOD_LIVEPATCH3
 
-check_sysfs_rights "$MOD_LIVEPATCH3" "stack_order" "-r--r--r--"
 check_sysfs_value  "$MOD_LIVEPATCH3" "stack_order" "3"
 
 disable_lp $MOD_LIVEPATCH2
 unload_lp $MOD_LIVEPATCH2
 
-check_sysfs_rights "$MOD_LIVEPATCH" "stack_order" "-r--r--r--"
 check_sysfs_value  "$MOD_LIVEPATCH" "stack_order" "1"
-check_sysfs_rights "$MOD_LIVEPATCH3" "stack_order" "-r--r--r--"
 check_sysfs_value  "$MOD_LIVEPATCH3" "stack_order" "2"
 
 disable_lp $MOD_LIVEPATCH3


With the above changes:

Reviewed-by: Petr Mladek 
Tested-by: Petr Mladek 

Please, try to send the next version together with the patch adding
the "stack_order" attribute.

Best Regards,
Petr

Re: [PATCH v4 16/19] gendwarfksyms: Add support for symbol type pointers

2024-10-23 Thread Petr Pavlu

On 10/8/24 20:38, Sami Tolvanen wrote:
> The compiler may choose not to emit type information in DWARF for
> external symbols. Clang, for example, does this for symbols not
> defined in the current TU.
> 
> To provide a way to work around this issue, add support for
> __gendwarfksyms_ptr_ pointers that force the compiler to emit
> the necessary type information in DWARF also for the missing symbols.
> 
> Example usage:
> 
>   #define GENDWARFKSYMS_PTR(sym) \
>   static typeof(sym) *__gendwarfksyms_ptr_##sym __used  \
>   __section(".discard.gendwarfksyms") = &sym;
> 
>   extern int external_symbol(void);
>   GENDWARFKSYMS_PTR(external_symbol);
> 
> Signed-off-by: Sami Tolvanen 
> Acked-by: Neal Gompa 

Looks ok to me, feel free to add:

Reviewed-by: Petr Pavlu 

-- Petr

Re: [PATCH v4 18/19] kbuild: Add gendwarfksyms as an alternative to genksyms

2024-10-23 Thread Petr Pavlu

On 10/8/24 20:38, Sami Tolvanen wrote:
> When MODVERSIONS is enabled, allow selecting gendwarfksyms as the
> implementation, but default to genksyms.
> 
> Signed-off-by: Sami Tolvanen 
> Acked-by: Neal Gompa 
> ---
>  kernel/module/Kconfig  | 25 -
>  scripts/Makefile   |  2 +-
>  scripts/Makefile.build | 39 +++
>  3 files changed, 56 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/module/Kconfig b/kernel/module/Kconfig
> index f9e5f82fa88b..e6b2427e5c19 100644
> --- a/kernel/module/Kconfig
> +++ b/kernel/module/Kconfig
> @@ -169,13 +169,36 @@ config MODVERSIONS
> make them incompatible with the kernel you are running.  If
> unsure, say N.
>  
> +choice
> + prompt "Module versioning implementation"
> + depends on MODVERSIONS
> + default GENKSYMS
> + help
> +   Select the tool used to calculate symbol versions for modules.
> +
> +   If unsure, select GENKSYMS.
> +
> +config GENKSYMS
> + bool "genksyms (from source code)"
> + help
> +   Calculate symbol versions from pre-processed source code using
> +   genksyms.
> +
> +   If unsure, say Y.
> +
>  config GENDWARFKSYMS
> - bool
> + bool "gendwarfksyms (from debugging information)"
>   depends on DEBUG_INFO
>   # Requires full debugging information, split DWARF not supported.
>   depends on !DEBUG_INFO_REDUCED && !DEBUG_INFO_SPLIT
>   # Requires ELF object files.
>   depends on !LTO
> + help
> +   Calculate symbol versions from DWARF debugging information using
> +   gendwarfksyms. Requires DEBUG_INFO to be enabled.
> +
> +   If unsure, say N.
> +endchoice
>  
>  config ASM_MODVERSIONS
>   bool
> diff --git a/scripts/Makefile b/scripts/Makefile
> index d7fec46d38c0..8533f4498885 100644
> --- a/scripts/Makefile
> +++ b/scripts/Makefile
> @@ -53,7 +53,7 @@ hostprogs += unifdef
>  targets += module.lds
>  
>  subdir-$(CONFIG_GCC_PLUGINS) += gcc-plugins
> -subdir-$(CONFIG_MODVERSIONS) += genksyms
> +subdir-$(CONFIG_GENKSYMS) += genksyms
>  subdir-$(CONFIG_GENDWARFKSYMS) += gendwarfksyms
>  subdir-$(CONFIG_SECURITY_SELINUX) += selinux
>  subdir-$(CONFIG_SECURITY_IPE) += ipe
> diff --git a/scripts/Makefile.build b/scripts/Makefile.build
> index 8f423a1faf50..ae13afb71123 100644
> --- a/scripts/Makefile.build
> +++ b/scripts/Makefile.build
> @@ -107,18 +107,28 @@ cmd_cpp_i_c   = $(CPP) $(c_flags) -o $@ $<
>  $(obj)/%.i: $(obj)/%.c FORCE
>   $(call if_changed_dep,cpp_i_c)
>  
> +gendwarfksyms := scripts/gendwarfksyms/gendwarfksyms
> +getexportsymbols = $(NM) $(1) | sed -n 's/.* __export_symbol_\(.*\)/$(2)/p'
> +
>  genksyms = scripts/genksyms/genksyms \
>   $(if $(1), -T $(2)) \
>   $(if $(KBUILD_PRESERVE), -p)\
>   -r $(or $(wildcard $(2:.symtypes=.symref)), /dev/null)
>  
>  # These mirror gensymtypes_S and co below, keep them in synch.
> +ifdef CONFIG_GENDWARFKSYMS
> +symtypes_dep_c = $(obj)/%.o
> +cmd_gensymtypes_c = $(if $(skip_gendwarfksyms),, \
> + $(call getexportsymbols,$(2:.symtypes=.o),\1) | \
> + $(gendwarfksyms) $(2:.symtypes=.o) $(if $(1), --symtypes $(2)))

Is it possible to pass options to gendwarfksyms that apply to the entire
build, specifically, how can one say to use the --stable option? If not
then I think it would be good to add something as
KBUILD_GENDWARFKSYMS_STABLE (similar to KBUILD_PRESERVE), or maybe
a generic GENDWARFKSYMSFLAGS?

-- 
Thanks,
Petr

[PATCH net-next v5 10/12] selftests: ncdevmem: Run selftest when none of the -s or -c has been provided

2024-10-23 Thread Stanislav Fomichev

This will be used as a 'probe' mode in the selftest to check whether
the device supports the devmem or not. Use hard-coded queue layout
(two last queues) and prevent user from passing custom -q and/or -t.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 42 --
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index fe4d81ef1ca5..07a91516103a 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -76,7 +76,7 @@ static char *client_ip;
 static char *port;
 static size_t do_validation;
 static int start_queue = -1;
-static int num_queues = 1;
+static int num_queues = -1;
 static char *ifname;
 static unsigned int ifindex;
 static unsigned int dmabuf_id;
@@ -718,19 +718,31 @@ int main(int argc, char *argv[])
}
}
 
-   if (!server_ip)
-   error(1, 0, "Missing -s argument\n");
-
-   if (!port)
-   error(1, 0, "Missing -p argument\n");
-
if (!ifname)
error(1, 0, "Missing -f argument\n");
 
ifindex = if_nametoindex(ifname);
 
-   if (start_queue < 0) {
-   start_queue = rxq_num(ifindex) - 1;
+   if (!server_ip && !client_ip) {
+   if (start_queue < 0 && num_queues < 0) {
+   num_queues = rxq_num(ifindex);
+   if (num_queues < 0)
+   error(1, 0, "couldn't detect number of 
queues\n");
+   /* make sure can bind to multiple queues */
+   start_queue = num_queues / 2;
+   num_queues /= 2;
+   }
+
+   if (start_queue < 0 || num_queues < 0)
+   error(1, 0, "Both -t and -q are required\n");
+
+   run_devmem_tests();
+   return 0;
+   }
+
+   if (start_queue < 0 && num_queues < 0) {
+   num_queues = 1;
+   start_queue = rxq_num(ifindex) - num_queues;
 
if (start_queue < 0)
error(1, 0, "couldn't detect number of queues\n");
@@ -741,7 +753,17 @@ int main(int argc, char *argv[])
for (; optind < argc; optind++)
fprintf(stderr, "extra arguments: %s\n", argv[optind]);
 
-   run_devmem_tests();
+   if (start_queue < 0)
+   error(1, 0, "Missing -t argument\n");
+
+   if (num_queues < 0)
+   error(1, 0, "Missing -q argument\n");
+
+   if (!server_ip)
+   error(1, 0, "Missing -s argument\n");
+
+   if (!port)
+   error(1, 0, "Missing -p argument\n");
 
mem = provider->alloc(getpagesize() * NUM_PAGES);
ret = is_server ? do_server(mem) : 1;
-- 
2.47.0

[PATCH net-next v5 09/12] selftests: ncdevmem: Remove hard-coded queue numbers

2024-10-23 Thread Stanislav Fomichev

Use single last queue of the device and probe it dynamically.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 40 --
 1 file changed, 38 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index e1faad46548b..fe4d81ef1ca5 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -75,8 +75,8 @@ static char *server_ip;
 static char *client_ip;
 static char *port;
 static size_t do_validation;
-static int start_queue = 8;
-static int num_queues = 8;
+static int start_queue = -1;
+static int num_queues = 1;
 static char *ifname;
 static unsigned int ifindex;
 static unsigned int dmabuf_id;
@@ -208,6 +208,33 @@ void validate_buffer(void *line, size_t size)
fprintf(stdout, "Validated buffer\n");
 }
 
+static int rxq_num(int ifindex)
+{
+   struct ethtool_channels_get_req *req;
+   struct ethtool_channels_get_rsp *rsp;
+   struct ynl_error yerr;
+   struct ynl_sock *ys;
+   int num = -1;
+
+   ys = ynl_sock_create(&ynl_ethtool_family, &yerr);
+   if (!ys) {
+   fprintf(stderr, "YNL: %s\n", yerr.msg);
+   return -1;
+   }
+
+   req = ethtool_channels_get_req_alloc();
+   ethtool_channels_get_req_set_header_dev_index(req, ifindex);
+   rsp = ethtool_channels_get(ys, req);
+   if (rsp)
+   num = rsp->rx_count + rsp->combined_count;
+   ethtool_channels_get_req_free(req);
+   ethtool_channels_get_rsp_free(rsp);
+
+   ynl_sock_destroy(ys);
+
+   return num;
+}
+
 #define run_command(cmd, ...)   \
({  \
char command[256];  \
@@ -702,6 +729,15 @@ int main(int argc, char *argv[])
 
ifindex = if_nametoindex(ifname);
 
+   if (start_queue < 0) {
+   start_queue = rxq_num(ifindex) - 1;
+
+   if (start_queue < 0)
+   error(1, 0, "couldn't detect number of queues\n");
+
+   fprintf(stderr, "using queues %d..%d\n", start_queue, 
start_queue + num_queues);
+   }
+
for (; optind < argc; optind++)
fprintf(stderr, "extra arguments: %s\n", argv[optind]);
 
-- 
2.47.0

[PATCH net-next v5 11/12] selftests: ncdevmem: Move ncdevmem under drivers/net/hw

2024-10-23 Thread Stanislav Fomichev

This is where all the tests that depend on the HW functionality live in
and this is where the automated test is gonna be added in the next
patch.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/drivers/net/hw/.gitignore | 1 +
 tools/testing/selftests/drivers/net/hw/Makefile   | 8 
 .../testing/selftests/{net => drivers/net/hw}/ncdevmem.c  | 0
 tools/testing/selftests/net/.gitignore| 1 -
 tools/testing/selftests/net/Makefile  | 8 
 5 files changed, 9 insertions(+), 9 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/hw/.gitignore
 rename tools/testing/selftests/{net => drivers/net/hw}/ncdevmem.c (100%)

diff --git a/tools/testing/selftests/drivers/net/hw/.gitignore 
b/tools/testing/selftests/drivers/net/hw/.gitignore
new file mode 100644
index ..e9fe6ede681a
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/.gitignore
@@ -0,0 +1 @@
+ncdevmem
diff --git a/tools/testing/selftests/drivers/net/hw/Makefile 
b/tools/testing/selftests/drivers/net/hw/Makefile
index c9f2f48fc30f..182348f4bd40 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -26,4 +26,12 @@ TEST_INCLUDES := \
../../../net/forwarding/tc_common.sh \
#
 
+# YNL files, must be before "include ..lib.mk"
+YNL_GEN_FILES := ncdevmem
+TEST_GEN_FILES += $(YNL_GEN_FILES)
+
 include ../../../lib.mk
+
+# YNL build
+YNL_GENS := ethtool netdev
+include ../../../net/ynl.mk
diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/drivers/net/hw/ncdevmem.c
similarity index 100%
rename from tools/testing/selftests/net/ncdevmem.c
rename to tools/testing/selftests/drivers/net/hw/ncdevmem.c
diff --git a/tools/testing/selftests/net/.gitignore 
b/tools/testing/selftests/net/.gitignore
index 217d8b7a7365..a78debbd1fe7 100644
--- a/tools/testing/selftests/net/.gitignore
+++ b/tools/testing/selftests/net/.gitignore
@@ -18,7 +18,6 @@ ipv6_flowlabel_mgr
 log.txt
 msg_oob
 msg_zerocopy
-ncdevmem
 nettest
 psock_fanout
 psock_snd
diff --git a/tools/testing/selftests/net/Makefile 
b/tools/testing/selftests/net/Makefile
index 759b1d2dc8b4..22a5d6a7c3f3 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -97,10 +97,6 @@ TEST_PROGS += fq_band_pktlimit.sh
 TEST_PROGS += vlan_hw_filter.sh
 TEST_PROGS += bpf_offload.py
 
-# YNL files, must be before "include ..lib.mk"
-YNL_GEN_FILES := ncdevmem
-TEST_GEN_FILES += $(YNL_GEN_FILES)
-
 TEST_FILES := settings
 TEST_FILES += in_netns.sh lib.sh net_helper.sh setup_loopback.sh setup_veth.sh
 
@@ -110,10 +106,6 @@ TEST_INCLUDES := forwarding/lib.sh
 
 include ../lib.mk
 
-# YNL build
-YNL_GENS := ethtool netdev
-include ynl.mk
-
 $(OUTPUT)/epoll_busy_poll: LDLIBS += -lcap
 $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
 $(OUTPUT)/tcp_mmap: LDLIBS += -lpthread -lcrypto
-- 
2.47.0

[PATCH net-next v5 07/12] selftests: ncdevmem: Properly reset flow steering

2024-10-23 Thread Stanislav Fomichev

ntuple off/on might be not enough to do it on all NICs.
Add a bunch of shell crap to explicitly remove the rules.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 19 ---
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index 8e4a0fe74bb1..697771c1f9fa 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -217,13 +217,18 @@ void validate_buffer(void *line, size_t size)
 
 static int reset_flow_steering(void)
 {
-   int ret = 0;
-
-   ret = run_command("sudo ethtool -K %s ntuple off >&2", ifname);
-   if (ret)
-   return ret;
-
-   return run_command("sudo ethtool -K %s ntuple on >&2", ifname);
+   /* Depending on the NIC, toggling ntuple off and on might not
+* be allowed. Additionally, attempting to delete existing filters
+* will fail if no filters are present. Therefore, do not enforce
+* the exit status.
+*/
+
+   run_command("sudo ethtool -K %s ntuple off >&2", ifname);
+   run_command("sudo ethtool -K %s ntuple on >&2", ifname);
+   run_command(
+   "sudo ethtool -n %s | grep 'Filter:' | awk '{print $2}' | xargs 
-n1 ethtool -N %s delete >&2",
+   ifname, ifname);
+   return 0;
 }
 
 static int configure_headersplit(bool on)
-- 
2.47.0

[PATCH net-next v5 00/12] selftests: ncdevmem: Add ncdevmem to ksft

2024-10-23 Thread Stanislav Fomichev

The goal of the series is to simplify and make it possible to use
ncdevmem in an automated way from the ksft python wrapper.

ncdevmem is slowly mutated into a state where it uses stdout
to print the payload and the python wrapper is added to
make sure the arrived payload matches the expected one.

v5:
- properly handle errors from inet_pton() and socket() (Paolo)
- remove unneeded import from python selftest (Paolo)

v4:
- keep usage example with validation (Mina)
- fix compilation issue in one patch (s/start_queues/start_queue/)

v3:
- keep and refine the comment about ncdevmem invocation (Mina)
- add the comment about not enforcing exit status for ntuple reset (Mina)
- make configure_headersplit more robust (Mina)
- use num_queues/2 in selftest and let the users override it (Mina)
- remove memory_provider.memcpy_to_device (Mina)
- keep ksft as is (don't use -v validate flags): we are gonna
  need a --debug-disable flag to make it less chatty; otherwise
  it times out when sending too much data; so leaving it as
  a separate follow up

v2:
- don't remove validation (Mina)
- keep 5-tuple flow steering but use it only when -c is provided (Mina)
- remove separate flag for probing (Mina)
- move ncdevmem under drivers/net/hw, not drivers/net (Jakub)

Cc: Mina Almasry 

Stanislav Fomichev (12):
  selftests: ncdevmem: Redirect all non-payload output to stderr
  selftests: ncdevmem: Separate out dmabuf provider
  selftests: ncdevmem: Unify error handling
  selftests: ncdevmem: Make client_ip optional
  selftests: ncdevmem: Remove default arguments
  selftests: ncdevmem: Switch to AF_INET6
  selftests: ncdevmem: Properly reset flow steering
  selftests: ncdevmem: Use YNL to enable TCP header split
  selftests: ncdevmem: Remove hard-coded queue numbers
  selftests: ncdevmem: Run selftest when none of the -s or -c has been
provided
  selftests: ncdevmem: Move ncdevmem under drivers/net/hw
  selftests: ncdevmem: Add automated test

 .../selftests/drivers/net/hw/.gitignore   |   1 +
 .../testing/selftests/drivers/net/hw/Makefile |   9 +
 .../selftests/drivers/net/hw/devmem.py|  45 +
 .../selftests/drivers/net/hw/ncdevmem.c   | 773 ++
 tools/testing/selftests/net/.gitignore|   1 -
 tools/testing/selftests/net/Makefile  |   8 -
 tools/testing/selftests/net/ncdevmem.c| 570 -
 7 files changed, 828 insertions(+), 579 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/hw/.gitignore
 create mode 100755 tools/testing/selftests/drivers/net/hw/devmem.py
 create mode 100644 tools/testing/selftests/drivers/net/hw/ncdevmem.c
 delete mode 100644 tools/testing/selftests/net/ncdevmem.c

-- 
2.47.0

[PATCH net-next v5 05/12] selftests: ncdevmem: Remove default arguments

2024-10-23 Thread Stanislav Fomichev

To make it clear what's required and what's not. Also, some of the
values don't seem like a good defaults; for example eth1.

Move the invocation comment to the top, add missing -s to the client
and cleanup the client invocation a bit to make more readable.

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/ncdevmem.c | 61 --
 1 file changed, 39 insertions(+), 22 deletions(-)

diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index 051dd8bb0d3c..b81b8de165f0 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -1,4 +1,31 @@
 // SPDX-License-Identifier: GPL-2.0
+/*
+ * tcpdevmem netcat. Works similarly to netcat but does device memory TCP
+ * instead of regular TCP. Uses udmabuf to mock a dmabuf provider.
+ *
+ * Usage:
+ *
+ * On server:
+ * ncdevmem -s  [-c ] -f eth1 -l -p 5201
+ *
+ * On client:
+ * echo -n "hello\nworld" | nc -s  5201 -p 5201
+ *
+ * Test data validation:
+ *
+ * On server:
+ * ncdevmem -s  [-c ] -f eth1 -l -p 5201 -v 7
+ *
+ * On client:
+ * yes $(echo -e \\x01\\x02\\x03\\x04\\x05\\x06) | \
+ * tr \\n \\0 | \
+ * head -c 5G | \
+ * nc  5201 -p 5201
+ *
+ *
+ * Note this is compatible with regular netcat. i.e. the sender or receiver can
+ * be replaced with regular netcat to test the RX or TX path in isolation.
+ */
 #define _GNU_SOURCE
 #define __EXPORTED_HEADERS__
 
@@ -42,32 +69,13 @@
 #define MSG_SOCK_DEVMEM 0x200
 #endif
 
-/*
- * tcpdevmem netcat. Works similarly to netcat but does device memory TCP
- * instead of regular TCP. Uses udmabuf to mock a dmabuf provider.
- *
- * Usage:
- *
- * On server:
- * ncdevmem -s  -c  -f eth1 -l -p 5201 -v 7
- *
- * On client:
- * yes $(echo -e \\x01\\x02\\x03\\x04\\x05\\x06) | \
- * tr \\n \\0 | \
- * head -c 5G | \
- * nc  5201 -p 5201
- *
- * Note this is compatible with regular netcat. i.e. the sender or receiver can
- * be replaced with regular netcat to test the RX or TX path in isolation.
- */
-
-static char *server_ip = "192.168.1.4";
+static char *server_ip;
 static char *client_ip;
-static char *port = "5201";
+static char *port;
 static size_t do_validation;
 static int start_queue = 8;
 static int num_queues = 8;
-static char *ifname = "eth1";
+static char *ifname;
 static unsigned int ifindex;
 static unsigned int dmabuf_id;
 
@@ -596,6 +604,15 @@ int main(int argc, char *argv[])
}
}
 
+   if (!server_ip)
+   error(1, 0, "Missing -s argument\n");
+
+   if (!port)
+   error(1, 0, "Missing -p argument\n");
+
+   if (!ifname)
+   error(1, 0, "Missing -f argument\n");
+
ifindex = if_nametoindex(ifname);
 
for (; optind < argc; optind++)
-- 
2.47.0

Re: [PATCH] selftests: livepatch: add test cases of stack_order sysfs interface

2024-10-23 Thread zhang warden

Hi, Petr.
> On Oct 23, 2024, at 20:38, Petr Mladek  wrote:
> 
> Please, try to send the next version together with the patch adding
> the "stack_order" attribute.
> 

Do you mean I should resend the patch of "livepatch: Add stack_order sysfs 
attribute" again with this "livepatch selftest" patch?

Regards.
Wardenjohn.

[PATCH AUTOSEL 6.11 03/30] selftests/bpf: Verify that sync_linked_regs preserves subreg_def

2024-10-23 Thread Sasha Levin

From: Eduard Zingerman 

[ Upstream commit a41b3828ec056a631ad22413d4560017fed5c3bd ]

This test was added because of a bug in verifier.c:sync_linked_regs(),
upon range propagation it destroyed subreg_def marks for registers.
The test is written in a way to return an upper half of a register
that is affected by range propagation and must have it's subreg_def
preserved. This gives a return value of 0 and leads to undefined
return value if subreg_def mark is not preserved.

Signed-off-by: Eduard Zingerman 
Signed-off-by: Andrii Nakryiko 
Signed-off-by: Daniel Borkmann 
Acked-by: Daniel Borkmann 
Link: https://lore.kernel.org/bpf/20240924210844.1758441-2-eddy...@gmail.com
Signed-off-by: Sasha Levin 
---
 .../selftests/bpf/progs/verifier_scalar_ids.c | 67 +++
 1 file changed, 67 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c 
b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
index 13b29a7faa71a..d24d3a36ec144 100644
--- a/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
+++ b/tools/testing/selftests/bpf/progs/verifier_scalar_ids.c
@@ -656,4 +656,71 @@ __naked void two_old_ids_one_cur_id(void)
: __clobber_all);
 }
 
+SEC("socket")
+/* Note the flag, see verifier.c:opt_subreg_zext_lo32_rnd_hi32() */
+__flag(BPF_F_TEST_RND_HI32)
+__success
+/* This test was added because of a bug in verifier.c:sync_linked_regs(),
+ * upon range propagation it destroyed subreg_def marks for registers.
+ * The subreg_def mark is used to decide whether zero extension instructions
+ * are needed when register is read. When BPF_F_TEST_RND_HI32 is set it
+ * also causes generation of statements to randomize upper halves of
+ * read registers.
+ *
+ * The test is written in a way to return an upper half of a register
+ * that is affected by range propagation and must have it's subreg_def
+ * preserved. This gives a return value of 0 and leads to undefined
+ * return value if subreg_def mark is not preserved.
+ */
+__retval(0)
+/* Check that verifier believes r1/r0 are zero at exit */
+__log_level(2)
+__msg("4: (77) r1 >>= 32 ; R1_w=0")
+__msg("5: (bf) r0 = r1   ; R0_w=0 R1_w=0")
+__msg("6: (95) exit")
+__msg("from 3 to 4")
+__msg("4: (77) r1 >>= 32 ; R1_w=0")
+__msg("5: (bf) r0 = r1   ; R0_w=0 R1_w=0")
+__msg("6: (95) exit")
+/* Verify that statements to randomize upper half of r1 had not been
+ * generated.
+ */
+__xlated("call unknown")
+__xlated("r0 &= 2147483647")
+__xlated("w1 = w0")
+/* This is how disasm.c prints BPF_ZEXT_REG at the moment, x86 and arm
+ * are the only CI archs that do not need zero extension for subregs.
+ */
+#if !defined(__TARGET_ARCH_x86) && !defined(__TARGET_ARCH_arm64)
+__xlated("w1 = w1")
+#endif
+__xlated("if w0 < 0xa goto pc+0")
+__xlated("r1 >>= 32")
+__xlated("r0 = r1")
+__xlated("exit")
+__naked void linked_regs_and_subreg_def(void)
+{
+   asm volatile (
+   "call %[bpf_ktime_get_ns];"
+   /* make sure r0 is in 32-bit range, otherwise w1 = w0 won't
+* assign same IDs to registers.
+*/
+   "r0 &= 0x7fff;"
+   /* link w1 and w0 via ID */
+   "w1 = w0;"
+   /* 'if' statement propagates range info from w0 to w1,
+* but should not affect w1->subreg_def property.
+*/
+   "if w0 < 10 goto +0;"
+   /* r1 is read here, on archs that require subreg zero
+* extension this would cause zext patch generation.
+*/
+   "r1 >>= 32;"
+   "r0 = r1;"
+   "exit;"
+   :
+   : __imm(bpf_ktime_get_ns)
+   : __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.43.0

Re: [PATCH v4 15/19] gendwarfksyms: Add support for reserved and ignored fields

2024-10-23 Thread Petr Pavlu

On 10/8/24 20:38, Sami Tolvanen wrote:
> Distributions that want to maintain a stable kABI need the ability
> to make ABI compatible changes to kernel data structures without
> affecting symbol versions, either because of LTS updates or backports.
> 
> With genksyms, developers would typically hide these changes from
> version calculation with #ifndef __GENKSYMS__, which would result
> in the symbol version not changing even though the actual type has
> changed.  When we process precompiled object files, this isn't an
> option.
> 
> Change union processing to recognize field name prefixes that allow
> the user to ignore the union completely during symbol versioning with
> a __kabi_ignored prefix in a field name, or to replace the type of a
> placeholder field using a __kabi_reserved field name prefix.
> 
> For example, assume we want to add a new field to an existing
> alignment hole in a data structure, and ignore the new field when
> calculating symbol versions:
> 
>   struct struct1 {
> int a;
> /* a 4-byte alignment hole */
> unsigned long b;
>   };
> 
> To add `int n` to the alignment hole, we can add a union that includes
> a __kabi_ignored field that causes gendwarfksyms to ignore the entire
> union:
> 
>   struct struct1 {
> int a;
> union {
>   char __kabi_ignored_0;
>   int n;
> };
> unsigned long b;
>   };
> 
> With --stable, both structs produce the same symbol version.
> 
> Alternatively, when a distribution expects future modification to a
> data structure, they can explicitly add reserved fields:
> 
>   struct struct2 {
> long a;
> long __kabi_reserved_0; /* reserved for future use */
>   };
> 
> To take the field into use, we can again replace it with a union, with
> one of the fields keeping the __kabi_reserved name prefix to indicate
> the original type:
> 
>   struct struct2 {
> long a;
> union {
>   long __kabi_reserved_0;
>   struct {
>   int b;
>   int v;
>   };
> };
> 
> Here gendwarfksyms --stable replaces the union with the type of the
> placeholder field when calculating versions.
> 
> Signed-off-by: Sami Tolvanen 
> Acked-by: Neal Gompa 
> ---
>  scripts/gendwarfksyms/dwarf.c | 202 +-
>  scripts/gendwarfksyms/examples/kabi.h |  80 +
>  scripts/gendwarfksyms/examples/kabi_ex0.c |  86 +
>  scripts/gendwarfksyms/examples/kabi_ex1.c |  89 ++
>  scripts/gendwarfksyms/examples/kabi_ex2.c |  98 +++
>  scripts/gendwarfksyms/gendwarfksyms.h |  29 
>  6 files changed, 583 insertions(+), 1 deletion(-)
>  create mode 100644 scripts/gendwarfksyms/examples/kabi_ex0.c
>  create mode 100644 scripts/gendwarfksyms/examples/kabi_ex1.c
>  create mode 100644 scripts/gendwarfksyms/examples/kabi_ex2.c
> 
> diff --git a/scripts/gendwarfksyms/dwarf.c b/scripts/gendwarfksyms/dwarf.c
> index b15f1a5db452..72e24140b6e3 100644
> --- a/scripts/gendwarfksyms/dwarf.c
> +++ b/scripts/gendwarfksyms/dwarf.c
> @@ -308,6 +308,9 @@ static void __process_list_type(struct state *state, 
> struct die *cache,
>  {
>   const char *name = get_name_attr(die);
>  
> + if (stable && is_kabi_prefix(name))
> + name = NULL;
> +
>   process_list_comma(state, cache);
>   process(cache, type);
>   process_type_attr(state, cache, die);
> @@ -441,11 +444,193 @@ static void process_variant_part_type(struct state 
> *state, struct die *cache,
>   process(cache, "}");
>  }
>  
> +static int get_kabi_status(Dwarf_Die *die)
> +{
> + const char *name = get_name_attr(die);
> +
> + if (is_kabi_prefix(name)) {
> + name += KABI_PREFIX_LEN;
> +
> + if (!strncmp(name, KABI_RESERVED_PREFIX,
> +  KABI_RESERVED_PREFIX_LEN))
> + return KABI_RESERVED;
> + if (!strncmp(name, KABI_IGNORED_PREFIX,
> +  KABI_IGNORED_PREFIX_LEN))
> + return KABI_IGNORED;
> + }
> +
> + return KABI_NORMAL;
> +}
> +
> +static int check_struct_member_kabi_status(struct state *state,
> +struct die *__unused, Dwarf_Die *die)
> +{
> + int res;
> +
> + if (dwarf_tag(die) != DW_TAG_member_type)
> + error("expected a member");

Nit: If I understand the code correctly, the failed tag check here would
indicate an error in the internal logic that should never happen, so
a plain assert() would be more appropriate? Similarly in
check_union_member_kabi_status().

> +
> + /*
> +  * If the union member is a struct, expect the __kabi field to
> +  * be the first member of the structure, i.e..:
> +  *
> +  * union {
> +  *  type new_member;
> +  *  struct {
> +  *  type __kabi_field;
> +  *  }
> +  * };
> +  */
> + res = get_kabi_status(die);
> +
> + if (res == KABI_RESERVED &&
> + !get_ref_die_attr(die, DW_AT_type, &sta

Re: [PATCH 1/4] dt-bindings: remoteproc: fsl,imx-rproc: add new compatible

2024-10-23 Thread Frank Li

On Wed, Oct 23, 2024 at 12:21:11PM -0400, Laurentiu Mihalcea wrote:
> From: Laurentiu Mihalcea 
>
> Add new compatible for imx95's CM7 with SOF.

It is not only add compatible string, but also change reg, reg-names ...

Please add descripts in commit message about these.

>
> Signed-off-by: Laurentiu Mihalcea 
> ---
>  .../bindings/remoteproc/fsl,imx-rproc.yaml| 58 +--
>  1 file changed, 53 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml 
> b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> index 57d75acb0b5e..ab0d8e017965 100644
> --- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> +++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
> @@ -28,6 +28,15 @@ properties:
>- fsl,imx8qxp-cm4
>- fsl,imx8ulp-cm33
>- fsl,imx93-cm33
> +  - fsl,imx95-cm7-sof
> +
> +  reg:
> +maxItems: 2
> +
> +  reg-names:
> +items:
> +  - const: dram
> +  - const: mailbox
>
>clocks:
>  maxItems: 1
> @@ -38,10 +47,8 @@ properties:
>Phandle to syscon block which provide access to System Reset Controller
>
>mbox-names:
> -items:
> -  - const: tx
> -  - const: rx
> -  - const: rxdb
> +minItems: 1
> +maxItems: 4
>
>mboxes:
>  description:
> @@ -49,7 +56,7 @@ properties:
>List of <&phandle type channel> - 1 channel for TX, 1 channel for RX, 
> 1 channel for RXDB.
>(see mailbox/fsl,mu.yaml)
>  minItems: 1
> -maxItems: 3
> +maxItems: 4
>
>memory-region:
>  description:
> @@ -84,6 +91,10 @@ properties:
>This property is to specify the resource id of the remote processor in 
> SoC
>which supports SCFW
>
> +  port:
> +$ref: /schemas/sound/audio-graph-port.yaml#
> +unevaluatedProperties: false
> +
>  required:
>- compatible
>
> @@ -114,6 +125,43 @@ allOf:
>properties:
>  power-domains: false
>
> +  - if:
> +  properties:
> +compatible:
> +  contains:
> +const: fsl,imx95-cm7-sof
> +then:
> +  properties:
> +mboxes:
> +  minItems: 4
> +mbox-names:
> +  items:
> +- const: txdb0
> +- const: txdb1
> +- const: rxdb0
> +- const: rxdb1
> +memory-region:
> +  maxItems: 1
> +  required:
> +- reg
> +- reg-names
> +- mboxes
> +- mbox-names
> +- memory-region
> +- port
> +else:
> +  properties:
> +reg: false
> +reg-names: false
> +mboxes:
> +  maxItems: 3
> +mbox-names:
> +  items:
> +- const: tx
> +- const: rx
> +- const: rxdb
> +port: false
> +
>  additionalProperties: false
>
>  examples:
> --
> 2.34.1
>

selftests/sched_ext: enq_last_no_enq_fails testcase fails

2024-10-23 Thread Vishal Chourasia

Hi,

= START =
TEST: enq_last_no_enq_fails
DESCRIPTION: Verify we fail to load a scheduler if we specify the 
SCX_OPS_ENQ_LAST flag without defining ops.enqueue()
OUTPUT:
ERR: enq_last_no_enq_fails.c:35
Incorrectly succeeded in to attaching scheduler
not ok 2 enq_last_no_enq_fails #
=  END  =


Above selftest fails even when BPF scheduler is not loaded into the kernel. 

Below is snippet from the dmesg verifing bpf program was not loaded:
sched_ext: enq_last_no_enq_fails: SCX_OPS_ENQ_LAST requires ops.enqueue() to be 
implemented
   scx_ops_enable.isra.0+0xde8/0xe30
   bpf_struct_ops_link_create+0x1ac/0x240
   link_create+0x178/0x400
   __sys_bpf+0x7ac/0xd50
   sys_bpf+0x2c/0x70
   system_call_exception+0x148/0x310
   system_call_vectored_common+0x15c/0x2ec
sched_ext: "enq_select_cpu_fails" does not implement cgroup cpu.weight
sched_ext: BPF scheduler "enq_select_cpu_fails" enabled
sched_ext: BPF scheduler "enq_select_cpu_fails" disabled (runtime error)



static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
{
...
 ret = validate_ops(ops);
  if (ret)
  goto err_disable;
...
  err_disable:
  mutex_unlock(&scx_ops_enable_mutex);
  /*
   * Returning an error code here would not pass all the error 
information
   * to userspace. Record errno using scx_ops_error() for cases
   * scx_ops_error() wasn't already invoked and exit indicating success 
so
   * that the error is notified through ops.exit() with all the details.
   *
   * Flush scx_ops_disable_work to ensure that error is reported before
   * init completion.
   */
  scx_ops_error("scx_ops_enable() failed (%d)", ret);
  kthread_flush_work(&scx_ops_disable_work);
  return 0;
  }

validate_ops() correctly reports the error, but err_disable path ultimately
returns with a value of zero

from: enq_last_no_enq_fails.c
static enum scx_test_status run(void *ctx)
{
struct enq_last_no_enq_fails *skel = ctx;
struct bpf_link *link;

link = bpf_map__attach_struct_ops(skel->maps.enq_last_no_enq_fails_ops);
if (link) {
SCX_ERR("Incorrectly succeeded in to attaching scheduler");
return SCX_TEST_FAIL;
}

bpf_link__destroy(link);

return SCX_TEST_PASS;
}

Re: [PATCH V13 03/14] KVM: x86: Fix Intel PT Host/Guest mode when host tracing also

2024-10-23 Thread Adrian Hunter

On 23/10/24 01:30, Sean Christopherson wrote:
> On Tue, Oct 22, 2024, Adrian Hunter wrote:
>> On 22/10/24 19:30, Sean Christopherson wrote:
> LOL, yeah, this needs to be burned with fire.  It's wildly broken.  So 
> for stable@,

 It doesn't seem wildly broken.  Just the VMM passing invalid CPUID
 and KVM not validating it.
>>>
>>> Heh, I agree with "just", but unfortunately "just ... not validating" a 
>>> large
>>> swath of userspace inputs is pretty widly broken.  More importantly, it's 
>>> not
>>> easy to fix.  E.g. KVM could require the inputs to exactly match hardware, 
>>> but
>>> that creates an ABI that I'm not entirely sure is desirable in the long 
>>> term.
>>
>> Although the CPUID ABI does not really change.  KVM does not support
>> emulating Intel PT, so accepting CPUID that the hardware cannot support
>> seems like a bit of a lie.
> 
> But it's not all or nothing, e.g. KVM should support exposing fewer address 
> ranges
> than are supported by hardware, so that the same virtual CPU model can be run 
> on
> different generations of hardware.
> 
>> Aren't there other features that KVM does not support if the hardware support
>> is not there?
> 
> Many.  But either features are one-off things without configurable properties,
> or KVM does the right thing (usually).  E.g. nested virtualization heavily 
> relies
> on hardware, and has a plethora of knobs, but KVM (usually) honors and 
> validates
> the configuration provided by userspace.
> 
>> To some degree, a testing and debugging feature does not have to be
>> available in 100% of cases because it can still be useful when it is
>> available.
> 
> I don't disagree, but "works on my machine" is how KVM has gotten into so many
> messes with such features.  I also don't necessarily disagree with supporting 
> a
> very limited subset of use cases, but I want such support to come as 
> well-defined
> package with proper guard rails, docs, and ideally tests.

Ok, so how about: leave VMM to choose CPUID, but then map it to what the
hardware actually supports for what is possible.  So the guest user might
not get trace data exactly as expected, or perhaps not at all, but at least
KVM doesn't die.  Then add documentation to explain how it all works.

Note, the number of address ranges is not that much of an issue because
currently all processors that support Intel PT virtualization have 2.

I have a feeling QEMU was targeting compatibility with IceLake, which
would probably work for all processors that support Intel PT virtualization
except for one feature - the maximum number of cycle thresholds (dropped
from 2048 to 16)

Re: [PATCH rcu] srcu: Guarantee non-negative return value from srcu_read_lock()

2024-10-23 Thread Paul E. McKenney

On Wed, Oct 23, 2024 at 02:58:07PM +0800, Alan Huang wrote:
> On Oct 22, 2024, at 22:26, Paul E. McKenney  wrote:
> > 
> > On Tue, Oct 22, 2024 at 12:13:12AM -0700, Christoph Hellwig wrote:
> >> On Tue, Oct 22, 2024 at 09:10:18AM +0200, Peter Zijlstra wrote:
> >>> Ah, well, the thing that got us here is that we (Andrii and me) wanted
> >>> to use -1 as an 'invalid' value to indicate SRCU is not currently in
> >>> use.
> >>> 
> >>> So it all being int is really rather convenient :-)
> >> 
> >> Then please document that use.  Maybe even with a symolic name for
> >> -1 that clearly describes these uses.
> > 
> > Would this work?
> > 
> > #define SRCU_INVALID_INDEX -1
> 
> Is there any similar guarantee of the return value of 
> get_state_synchronize_rcu
> or start_poll_synchronize_rcu, like invalid value?

Yes, there is a get_completed_synchronize_rcu() function that returns a
value that causes poll_state_synchronize_rcu() to always return true.
There is also a get_completed_synchronize_rcu_full() function that
returns a value that causes poll_state_synchronize_rcu_full() to always
return true.

There has been some discussion of another set of values that cause
poll_state_synchronize_rcu() and poll_state_synchronize_rcu_full() to
always return false, but there is not yet a use case for this.  Easy to
provide if required, but why further explode the RCU API unless it really
is required?

Thanx, Paul

> > Whatever the name, maybe Peter and Andrii define this under #ifndef
> > right now, and we get it into include/linux/srcu.h over time.
> > 
> > Or is there a better way?  Or name, for that matter.
> > 
> > Thanx, Paul
> > 
>

Re: [PATCH v4 13/19] gendwarfksyms: Add symbol versioning

2024-10-23 Thread Sami Tolvanen

Hi Petr,

On Tue, Oct 22, 2024 at 4:48 AM Petr Pavlu  wrote:
>
> I had some minor comment about adjusting the name of function
> symbol_print_versions() and possibly changing sym->name to 'char *' on
> the v2 of the patch:
> https://lore.kernel.org/all/286b1cc5-1757-4f0a-bb66-0875f4608...@suse.com/
> Please have a look, it seems it felt through the cracks.

Sorry, I missed that somehow. I can split this into two functions to
avoid confusion. Also I'm using const char * to make it obvious that
the name shouldn't be modified by users of struct symbol. This does
require a cast when freeing the string, which isn't ideal, but I feel
it's overall not a terrible trade-off.

Sami

Re: [PATCH net 2/3] mptcp: remove unneeded lock when listing scheds

2024-10-23 Thread Matthieu Baerts

Hi Simon,

Thank you for the reviews!

On 23/10/2024 14:21, Simon Horman wrote:
> On Mon, Oct 21, 2024 at 12:25:27PM +0200, Matthieu Baerts (NGI0) wrote:
>> mptcp_get_available_schedulers() needs to iterate over the schedulers'
>> list only to read the names: it doesn't modify anything there.
>>
>> In this case, it is enough to hold the RCU read lock, no need to combine
>> this with the associated spin lock.
>>
>> Fixes: 73c900aa3660 ("mptcp: add net.mptcp.available_schedulers")
>> Cc: sta...@vger.kernel.org
>> Suggested-by: Paolo Abeni 
>> Reviewed-by: Geliang Tang 
>> Signed-off-by: Matthieu Baerts (NGI0) 
> 
> I do wonder if it would be more appropriate to route this via net-next
> (without a fixes tag) rather than via net. But either way this looks good
> to me.
Good point. On one hand, I marked it as a fix, because when working on
the patch 1/3, we noticed these spin_(un)lock() were not supposed to be
there in the first place. On the other hand, even it's fixing a small
performance issue, it is not fixing a regression.

I think it is easier to route this via -net, but I'm fine if it is
applied in net-next.

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.

Re: [PATCH rcu] srcu: Guarantee non-negative return value from srcu_read_lock()

2024-10-23 Thread Alan Huang

On Oct 24, 2024, at 00:34, Andrii Nakryiko  wrote:
> 
> On Tue, Oct 22, 2024 at 11:40 PM Christoph Hellwig  wrote:
>> 
>> On Tue, Oct 22, 2024 at 10:29:13AM -0700, Andrii Nakryiko wrote:
 
 Would this work?
 
 #define SRCU_INVALID_INDEX -1
 
>>> 
>>> But why?
>> 
>> Becaue it very clearly documents what is going on.
> 
> So does saying "returned value is going to be non-negative, always".
> It's not some weird and unusual thing in C by any means.
> 
>> 
>>> It's a nice property to have an int-returning API where valid
>>> values are only >= 0, so callers are free to use the entire negative
>>> range (not just -1) for whatever they need to store in case there is
>>> no srcu_idx value.
>> 
>> Well, if you have a concrete use case for that we can probably live
>> with it, but I'd rather have that use case extremely well documented,
>> as it will be very puzzling to the reader.
>> 
> 
> See [0] for what Peter is proposing. Note hprobe_init() and its use of
> `srcu_idx < 0` as a mark that there is no SRCU "session" associated
> with the uprobe. Now, if we say that SRCU_INVALID_INDEX is the only
> invalid value, it leaves a question: what to do with other negative
> srcu_idx values? Are they valid? Should I now WARN() on "unknown
> invalid" values? Why all these complications? I'd rather just not have
> a unified hprobe_init() at that point, TBH.
> 
> But anyways, taking a step back. Take idr_alloc()/idr_alloc_cyclic()
> APIs as an example. They return int, but valid IDs are documented to
> be positive. This leaves users of this API free to use int to store ID
> in their data structures, knowing that <= 0 is "no ID", and if
> necessary, they can have multiple possible "no ID" situations.
> 
> Same principle here. Why prescribing a randomly picked -1 as the only
> "valid" invalid value, if anything negative is clearly impossible.
> 

A IS_INVALID_SRCU_INDEX macro would suffice, you need the condition anyway.

> 
>  [0] 
> https://lore.kernel.org/linux-trace-kernel/20241018101647.ga36...@noisy.programming.kicks-ass.net/

Re: [PATCH v1 04/10] iommufd/viommu: Allow drivers to control vdev_id lifecycle

2024-10-23 Thread Jason Gunthorpe

On Wed, Oct 23, 2024 at 12:22:15AM -0700, Nicolin Chen wrote:
> On Thu, Sep 05, 2024 at 03:01:19PM -0300, Jason Gunthorpe wrote:
> > On Tue, Aug 27, 2024 at 10:02:06AM -0700, Nicolin Chen wrote:
> > > The iommufd core provides a lookup helper for an IOMMU driver to find a
> > > device pointer by device's per-viommu virtual ID. Yet a driver may need
> > > an inverted lookup to find a device's per-viommu virtual ID by a device
> > > pointer, e.g. when reporting virtual IRQs/events back to the user space.
> > > In this case, it'd be unsafe for iommufd core to do an inverted lookup,
> > > as the driver can't track the lifecycle of a viommu object or a vdev_id
> > > object.
> > > 
> > > Meanwhile, some HW can even support virtual device ID lookup by its HW-
> > > accelerated virtualization feature. E.g. Tegra241 CMDQV HW supports to
> > > execute vanilla guest-issued SMMU commands containing virtual Stream ID
> > > but requires software to configure a link between virtual Stream ID and
> > > physical Stream ID via HW registers. So not only the iommufd core needs
> > > a vdev_id lookup table, drivers will want one too.
> > > 
> > > Given the two justifications above, it's the best practice to provide a
> > > a pair of set_vdev_id/unset_vdev_id ops in the viommu ops, so a driver
> > > can implement them to control a vdev_id's lifecycle, and configure the
> > > HW properly if required.
> > 
> > I think the lifecycle rules should be much simpler.
> > 
> > If a nested domain is attached to a STE/RID/device then the vIOMMU
> > affiliated with that nested domain is pinned while the STE is in place
> > 
> > So the driver only need to provide locking around attach changing the
> > STE's vIOMMU vs async operations translating from a STE to a
> > vIOMMU. This can be a simple driver lock of some kind, ie a rwlock
> > across the STE table.
> > 
> > Generally that is how all the async events should work, go from the
> > STE to the VIOMMU to a iommufd callback to the iommufd event
> > queue. iommufd will translate the struct device from the driver to an
> > idev_id (or maybe even a vdevid) the same basic way the PRI code works
> 
> I am trying to draft something following this, and here is what
> it would look like:
> 
> draft---
> struct arm_smmu_master {
>   
> + struct rw_semaphore lock;

It would be called vsmmu_lock

> + struct arm_vsmmu *vsmmu;
>   
> };
> 
> ->attach_dev() {
>   down_write(&master->lock);
>   if (domain->type == IOMMU_DOMAIN_NESTED)
>   master->vsmmu = to_smmu_domain(domain)->vsmmu;
>   else
>   master->vsmmu = NULL;
>   up_write(&master->lock);
> }
> 
> isr() {
>   down_read(&master->lock);
>   if (master->vsmmu) {
>   xa_lock(&master->vsmmu->core.vdevs);
>   vdev = iommufd_viommu_dev_to_vdev(&master->vsmmu->core,
> master->dev);
>   if (vdev) {
>   struct iommu_virq_arm_smmuv3 virq_data = evt;
> 
>   virq_data.evt[0] &= ~EVTQ_0_SID;
>   virq_data.evt[0] |= FIELD_PREP(EVTQ_0_SID, vdev->id);
>   return iommufd_viommu_report_irq(
>   vdev->viommu,
>   IOMMU_VIRQ_TYPE_ARM_SMMUV3, &virq_data,
>   sizeof(virq_data));
>   } else {
>   rc = -ENOENT;
>   }
>   xa_unlock(&master->vsmmu->core.vdevs);
>   }
>   up_read(&master->lock);
> }
> 

This looks reasonable

> [Comparison]  | This v1 | Draft
> 1. Adds to master | A lock and vdev ptr | A lock and viommu ptr
> 2. Set/unset ptr  | In ->vdevice_alloc/free | In all ->attach_dev
> 3. Do dev_to_vdev | master->vdev->id| attach_handle?

The set/unset ops have the major issue that they can get out of sync
with the domain. The only time things should be routed to the viommu
is when a viommu related domain is attached.

The lock on attach can be reduced:

  iommu_group_mutex_assert(dev)
  if (domain->type == IOMMU_DOMAIN_NESTED)
new_vsmmu = to_smmu_domain(domain)->vsmmu;
  else
new_vsmmu = NULL;
  if (new_vsmmu != master->vsmmu) {
down_write(&master->lock);
master->vsmmu = new_vsmmu;
up_write(&master->lock);
  }

And you'd stick this in prepare or commit..

> Both solutions needs a driver-level lock and an extra pointer in
> the master structure. And both ISR routines require that driver-
> level lock to avoid race against attach_dev v.s. vdev alloc/free.
> Overall, taking step.3 into consideration, it seems that letting
> master lock&hold the vdev pointer (i.e. this v1) is simpler?

I'm not sure the vdev pointer should even be visible to the drivers..

> As for the implementati

Re: [PATCH v4 2/4] pidfd: add PIDFD_SELF_* sentinels to refer to own thread/process

2024-10-23 Thread Lorenzo Stoakes

On Wed, Oct 23, 2024 at 10:18:28AM -0700, Shakeel Butt wrote:
> On Wed, Oct 23, 2024 at 08:18:35AM GMT, Lorenzo Stoakes wrote:
> > On Tue, Oct 22, 2024 at 05:53:00PM -0700, Shakeel Butt wrote:
> > > On Thu, Oct 17, 2024 at 10:05:50PM GMT, Lorenzo Stoakes wrote:
> > > > It is useful to be able to utilise the pidfd mechanism to reference the
> > > > current thread or process (from a userland point of view - thread group
> > > > leader from the kernel's point of view).
> > > >
> > > > Therefore introduce PIDFD_SELF_THREAD to refer to the current thread, 
> > > > and
> > > > PIDFD_SELF_THREAD_GROUP to refer to the current thread group leader.
> > > >
> > > > For convenience and to avoid confusion from userland's perspective we 
> > > > alias
> > > > these:
> > > >
> > > > * PIDFD_SELF is an alias for PIDFD_SELF_THREAD - This is nearly always 
> > > > what
> > > >   the user will want to use, as they would find it surprising if for
> > > >   instance fd's were unshared()'d and they wanted to invoke 
> > > > pidfd_getfd()
> > > >   and that failed.
> > > >
> > > > * PIDFD_SELF_PROCESS is an alias for PIDFD_SELF_THREAD_GROUP - Most 
> > > > users
> > > >   have no concept of thread groups or what a thread group leader is, and
> > > >   from userland's perspective and nomenclature this is what userland
> > > >   considers to be a process.
> > >
> > > Should users use PIDFD_SELF_PROCESS in process_madvise() for self
> > > madvise() (once the support is added)?
> >
> > You can use either it will make no difference as both will get you to
> > current->mm which has to be shared. So I'd go with PIDFD_SELF for brevity
> > :)
> >
> > This series and the prerequisites I already added to process_madvise()
> > already provide support so with this in you can just use this for that.
>
> Thanks a lot, this is awesome. Is the plan for this series to go through
> mm-tree or through Christian?
>

Thanks!

I am assuming this will go through Christian's tree as entirely within the
pidfd realm :)

Christian - I am assuming this is your expectation too right?

I cc'd linux-mm more for convenience as obviously am hoping to use this for
an mm thing myself.

Re: [PATCH 2/4] ASoC: dt-bindings: audio-graph-card2: add widgets and hp-det-gpios support

2024-10-23 Thread Kuninori Morimoto



Hi

> > From: Laurentiu Mihalcea 
> >
> > Introduce the 'widgets' property, allowing the creation of widgets from
> > 4 template widgets: Microphone, Line, Headphone, and Speaker. Also
> > introduce the 'hp-det-gpios' property, which allows using headphone
> > detection using the specified GPIO.
> >
> > Signed-off-by: Laurentiu Mihalcea 
> > ---
> >  .../devicetree/bindings/sound/audio-graph-card2.yaml | 9 +
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml 
> > b/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
> > index f943f90d8b15..f0300a08f7fe 100644
> > --- a/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
> > +++ b/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
> > @@ -37,6 +37,15 @@ properties:
> >codec2codec:
> >  type: object
> >  description: Codec to Codec node
> > +  hp-det-gpios:
> > +maxItems: 1
> > +  widgets:
> > +description:
> > +  User specified audio sound widgets.
> > +  Each entry is a pair of strings, the first being the type of
> > +  widget ("Microphone", "Line", "Headphone", "Speaker"), the
> > +  second being the machine specific name for the widget.
> > +$ref: /schemas/types.yaml#/definitions/non-unique-string-array
> 
> Is it possible $ref: /schemas/sound/audio-graph.yaml to avoid duplicate
> these properties like audio-graph-card.yaml

You can find both "hp-det-gpios" and "widget" already exist in audio-graph.yaml

Thank you for your help !!

Best regards
---
Kuninori Morimoto

[PATCH 1/4] dt-bindings: remoteproc: fsl,imx-rproc: add new compatible

2024-10-23 Thread Laurentiu Mihalcea

From: Laurentiu Mihalcea 

Add new compatible for imx95's CM7 with SOF.

Signed-off-by: Laurentiu Mihalcea 
---
 .../bindings/remoteproc/fsl,imx-rproc.yaml| 58 +--
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml 
b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
index 57d75acb0b5e..ab0d8e017965 100644
--- a/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
+++ b/Documentation/devicetree/bindings/remoteproc/fsl,imx-rproc.yaml
@@ -28,6 +28,15 @@ properties:
   - fsl,imx8qxp-cm4
   - fsl,imx8ulp-cm33
   - fsl,imx93-cm33
+  - fsl,imx95-cm7-sof
+
+  reg:
+maxItems: 2
+
+  reg-names:
+items:
+  - const: dram
+  - const: mailbox
 
   clocks:
 maxItems: 1
@@ -38,10 +47,8 @@ properties:
   Phandle to syscon block which provide access to System Reset Controller
 
   mbox-names:
-items:
-  - const: tx
-  - const: rx
-  - const: rxdb
+minItems: 1
+maxItems: 4
 
   mboxes:
 description:
@@ -49,7 +56,7 @@ properties:
   List of <&phandle type channel> - 1 channel for TX, 1 channel for RX, 1 
channel for RXDB.
   (see mailbox/fsl,mu.yaml)
 minItems: 1
-maxItems: 3
+maxItems: 4
 
   memory-region:
 description:
@@ -84,6 +91,10 @@ properties:
   This property is to specify the resource id of the remote processor in 
SoC
   which supports SCFW
 
+  port:
+$ref: /schemas/sound/audio-graph-port.yaml#
+unevaluatedProperties: false
+
 required:
   - compatible
 
@@ -114,6 +125,43 @@ allOf:
   properties:
 power-domains: false
 
+  - if:
+  properties:
+compatible:
+  contains:
+const: fsl,imx95-cm7-sof
+then:
+  properties:
+mboxes:
+  minItems: 4
+mbox-names:
+  items:
+- const: txdb0
+- const: txdb1
+- const: rxdb0
+- const: rxdb1
+memory-region:
+  maxItems: 1
+  required:
+- reg
+- reg-names
+- mboxes
+- mbox-names
+- memory-region
+- port
+else:
+  properties:
+reg: false
+reg-names: false
+mboxes:
+  maxItems: 3
+mbox-names:
+  items:
+- const: tx
+- const: rx
+- const: rxdb
+port: false
+
 additionalProperties: false
 
 examples:
-- 
2.34.1

Re: [PATCH rcu] srcu: Guarantee non-negative return value from srcu_read_lock()

2024-10-23 Thread Alan Huang

On Oct 24, 2024, at 00:40, Paul E. McKenney  wrote:
> 
> On Wed, Oct 23, 2024 at 02:58:07PM +0800, Alan Huang wrote:
>> On Oct 22, 2024, at 22:26, Paul E. McKenney  wrote:
>>> 
>>> On Tue, Oct 22, 2024 at 12:13:12AM -0700, Christoph Hellwig wrote:
 On Tue, Oct 22, 2024 at 09:10:18AM +0200, Peter Zijlstra wrote:
> Ah, well, the thing that got us here is that we (Andrii and me) wanted
> to use -1 as an 'invalid' value to indicate SRCU is not currently in
> use.
> 
> So it all being int is really rather convenient :-)
 
 Then please document that use.  Maybe even with a symolic name for
 -1 that clearly describes these uses.
>>> 
>>> Would this work?
>>> 
>>> #define SRCU_INVALID_INDEX -1
>> 
>> Is there any similar guarantee of the return value of 
>> get_state_synchronize_rcu
>> or start_poll_synchronize_rcu, like invalid value?
> 
> Yes, there is a get_completed_synchronize_rcu() function that returns a
> value that causes poll_state_synchronize_rcu() to always return true.
> There is also a get_completed_synchronize_rcu_full() function that
> returns a value that causes poll_state_synchronize_rcu_full() to always
> return true.

This is exactly the API I was searching for, didn’t read the doc thoroughly : )

Thanks!

> 
> There has been some discussion of another set of values that cause
> poll_state_synchronize_rcu() and poll_state_synchronize_rcu_full() to
> always return false, but there is not yet a use case for this.  Easy to
> provide if required, but why further explode the RCU API unless it really
> is required?
> 
> Thanx, Paul
> 
>>> Whatever the name, maybe Peter and Andrii define this under #ifndef
>>> right now, and we get it into include/linux/srcu.h over time.
>>> 
>>> Or is there a better way?  Or name, for that matter.
>>> 
>>> Thanx, Paul
>>> 
>>

Re: [PATCH v2 2/2] soc: qcom: pmic_glink: Handle GLINK intent allocation rejections

2024-10-23 Thread Chris Lew





On 10/23/2024 10:24 AM, Bjorn Andersson wrote:

Some versions of the pmic_glink firmware does not allow dynamic GLINK
intent allocations, attempting to send a message before the firmware has
allocated its receive buffers and announced these intent allocations
will fail. When this happens something like this showns up in the log:

 pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to send 
altmode request: 0x10 (-125)
 pmic_glink_altmode.pmic_glink_altmode pmic_glink.altmode.0: failed to 
request altmode notifications: -125
 ucsi_glink.pmic_glink_ucsi pmic_glink.ucsi.0: failed to send UCSI read 
request: -125
 qcom_battmgr.pmic_glink_power_supply pmic_glink.power-supply.0: failed to 
request power notifications

GLINK has been updated to distinguish between the cases where the remote
is going down (-ECANCELED) and the intent allocation being rejected
(-EAGAIN).

Retry the send until intent buffers becomes available, or an actual
error occur.

To avoid infinitely waiting for the firmware in the event that this
misbehaves and no intents arrive, an arbitrary 5 second timeout is
used.

This patch was developed with input from Chris Lew.

Reported-by: Johan Hovold 
Closes: https://lore.kernel.org/all/zqet8iinndhnx...@hovoldconsulting.com/#t
Cc: sta...@vger.kernel.org # rpmsg: glink: Handle rejected intent request better
Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK driver")
Tested-by: Johan Hovold 
Reviewed-by: Johan Hovold 
Signed-off-by: Bjorn Andersson 
---


Reviewed-by: Chris Lew

Re: [PATCH v2 1/2] rpmsg: glink: Handle rejected intent request better

2024-10-23 Thread Chris Lew





On 10/23/2024 10:24 AM, Bjorn Andersson wrote:

GLINK operates using pre-allocated buffers, aka intents, where incoming
messages are aggregated before being passed up the stack. In the case
that no suitable intents have been announced by the receiver, the sender
can request an intent to be allocated.

The initial implementation of the response to such request dealt
with two outcomes; granted allocations, and all other cases being
considered -ECANCELLED (likely from "cancelling the operation as the
remote is going down").

But on some channels intent allocation is not supported, instead the
remote will pre-allocate and announce a fixed number of intents for the
sender to use. If for such channels an rpmsg_send() is being invoked
before any channels have been announced, an intent request will be
issued and as this comes back rejected the call fails with -ECANCELED.

Given that this is reported in the same way as the remote being shut
down, there's no way for the client to differentiate the two cases.

In line with the original GLINK design, change the return value to
-EAGAIN for the case where the remote rejects an intent allocation
request.

It's tempting to handle this case in the GLINK core, as we expect
intents to show up in this case. But there's no way to distinguish
between this case and a rejection for a too big allocation, nor is it
possible to predict if a currently used (and seemingly suitable) intent
will be returned for reuse or not. As such, returning the error to the
client and allow it to react seems to be the only sensible solution.

In addition to this, commit 'c05dfce0b89e ("rpmsg: glink: Wait for
intent, not just request ack")' changed the logic such that the code
always wait for an intent request response and an intent. This works out
in most cases, but in the event that an intent request is rejected and no
further intent arrives (e.g. client asks for a too big intent), the code
will stall for 10 seconds and then return -ETIMEDOUT; instead of a more
suitable error.

This change also resulted in intent requests racing with the shutdown of
the remote would be exposed to this same problem, unless some intent
happens to arrive. A patch for this was developed and posted by Sarannya
S [1], and has been incorporated here.

To summarize, the intent request can end in 4 ways:
- Timeout, no response arrived => return -ETIMEDOUT
- Abort TX, the edge is going away => return -ECANCELLED
- Intent request was rejected => return -EAGAIN
- Intent request was accepted, and an intent arrived => return 0

This patch was developed with input from Sarannya S, Deepak Kumar Singh,
and Chris Lew.

[1] 
https://lore.kernel.org/all/20240925072328.1163183-1-quic_dee...@quicinc.com/

Fixes: c05dfce0b89e ("rpmsg: glink: Wait for intent, not just request ack")
Cc: sta...@vger.kernel.org
Tested-by: Johan Hovold 
Signed-off-by: Bjorn Andersson 
---


Reviewed-by: Chris Lew

Re: [PATCH 2/4] ASoC: dt-bindings: audio-graph-card2: add widgets and hp-det-gpios support

2024-10-23 Thread Frank Li

On Wed, Oct 23, 2024 at 12:21:12PM -0400, Laurentiu Mihalcea wrote:
> From: Laurentiu Mihalcea 
>
> Introduce the 'widgets' property, allowing the creation of widgets from
> 4 template widgets: Microphone, Line, Headphone, and Speaker. Also
> introduce the 'hp-det-gpios' property, which allows using headphone
> detection using the specified GPIO.
>
> Signed-off-by: Laurentiu Mihalcea 
> ---
>  .../devicetree/bindings/sound/audio-graph-card2.yaml | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml 
> b/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
> index f943f90d8b15..f0300a08f7fe 100644
> --- a/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
> +++ b/Documentation/devicetree/bindings/sound/audio-graph-card2.yaml
> @@ -37,6 +37,15 @@ properties:
>codec2codec:
>  type: object
>  description: Codec to Codec node
> +  hp-det-gpios:
> +maxItems: 1
> +  widgets:
> +description:
> +  User specified audio sound widgets.
> +  Each entry is a pair of strings, the first being the type of
> +  widget ("Microphone", "Line", "Headphone", "Speaker"), the
> +  second being the machine specific name for the widget.
> +$ref: /schemas/types.yaml#/definitions/non-unique-string-array

Is it possible $ref: /schemas/sound/audio-graph.yaml to avoid duplicate
these properties like audio-graph-card.yaml

Frank

>
>  required:
>- compatible
> --
> 2.34.1
>

Re: selftests/sched_ext: enq_last_no_enq_fails testcase fails

2024-10-23 Thread Tejun Heo

Hello,

On Wed, Oct 23, 2024 at 10:13:19PM +0530, Vishal Chourasia wrote:
...
> static int scx_ops_enable(struct sched_ext_ops *ops, struct bpf_link *link)
> {
> ...
>  ret = validate_ops(ops);
>   if (ret)
>   goto err_disable;
> ...
>   err_disable:
>   mutex_unlock(&scx_ops_enable_mutex);
>   /*
>* Returning an error code here would not pass all the error 
> information
>* to userspace. Record errno using scx_ops_error() for cases
>* scx_ops_error() wasn't already invoked and exit indicating 
> success so
>* that the error is notified through ops.exit() with all the 
> details.
>*
>* Flush scx_ops_disable_work to ensure that error is reported 
> before
>* init completion.
>*/
>   scx_ops_error("scx_ops_enable() failed (%d)", ret);
>   kthread_flush_work(&scx_ops_disable_work);
>   return 0;
>   }
> 
> validate_ops() correctly reports the error, but err_disable path ultimately
> returns with a value of zero

Yeah, this is because the failure is now communicated through the scheduler
unload path which has richer error reporting. The exit is triggered
immediately but loading still succeeds. We need to update the test framework
to detect this failure mode too.

Thanks.

-- 
tejun

[PATCH 2/6] kselftest/arm64: Remove unused ADRs from irritator handlers

2024-10-23 Thread Mark Brown

The irritator handlers for the fp-stress test programs all use ADR to load
an address into x0 which is then not referenced. Remove these ADRs as they
just cause confusion.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/arm64/fp/fpsimd-test.S | 1 -
 tools/testing/selftests/arm64/fp/sve-test.S| 1 -
 tools/testing/selftests/arm64/fp/za-test.S | 1 -
 tools/testing/selftests/arm64/fp/zt-test.S | 1 -
 4 files changed, 4 deletions(-)

diff --git a/tools/testing/selftests/arm64/fp/fpsimd-test.S 
b/tools/testing/selftests/arm64/fp/fpsimd-test.S
index 
bdfb7cf2e4ec175fda62c1c2f38c6ebb1a1c48bf..9977ffdd758a51a7af67cd607d019a6c54d3a6c6
 100644
--- a/tools/testing/selftests/arm64/fp/fpsimd-test.S
+++ b/tools/testing/selftests/arm64/fp/fpsimd-test.S
@@ -142,7 +142,6 @@ function irritator_handler
str x0, [x2, #ucontext_regs + 8 * 23]
 
// Corrupt some random V-regs
-   adr x0, .text + (irritator_handler - .text) / 16 * 16
moviv0.8b, #7
moviv9.16b, #9
moviv31.8b, #31
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S 
b/tools/testing/selftests/arm64/fp/sve-test.S
index 
e3c0d585684df29723a49265f3df6d23817498c7..f1fb9745c681786f686f1fafcb7e1154f3c8e1a3
 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -299,7 +299,6 @@ function irritator_handler
str x0, [x2, #ucontext_regs + 8 * 23]
 
// Corrupt some random Z-regs
-   adr x0, .text + (irritator_handler - .text) / 16 * 16
moviv0.8b, #1
moviv9.16b, #2
moviv31.8b, #3
diff --git a/tools/testing/selftests/arm64/fp/za-test.S 
b/tools/testing/selftests/arm64/fp/za-test.S
index 
095b45531640966e685408057c08ada67e68998b..1ee0ec36766d2bef92aff50a002813e76e22963c
 100644
--- a/tools/testing/selftests/arm64/fp/za-test.S
+++ b/tools/testing/selftests/arm64/fp/za-test.S
@@ -158,7 +158,6 @@ function irritator_handler
 
// Corrupt some random ZA data
 #if 0
-   adr x0, .text + (irritator_handler - .text) / 16 * 16
moviv0.8b, #1
moviv9.16b, #2
moviv31.8b, #3
diff --git a/tools/testing/selftests/arm64/fp/zt-test.S 
b/tools/testing/selftests/arm64/fp/zt-test.S
index 
b5c81e81a37946c1bffe810568855939e9ceb08e..ade9c98abcdafc2755ef4796670566d99e919e5c
 100644
--- a/tools/testing/selftests/arm64/fp/zt-test.S
+++ b/tools/testing/selftests/arm64/fp/zt-test.S
@@ -127,7 +127,6 @@ function irritator_handler
 
// Corrupt some random ZT data
 #if 0
-   adr x0, .text + (irritator_handler - .text) / 16 * 16
moviv0.8b, #1
moviv9.16b, #2
moviv31.8b, #3

-- 
2.39.2

[PATCH 0/6] kselftest/arm64: Test floating point signal context restore in fp-stress

2024-10-23 Thread Mark Brown

Currently we test signal delivery to the programs run by fp-stress but
our signal handlers simply count the number of signals seen and don't do
anything with the floating point state.  The original fpsimd-test and
sve-test programs had signal handlers called irritators which modify the
live register state, verifying that we restore the signal context on
return, but a combination of misleading comments and code resulted in
them never being used and the equivalent handlers in the other tests
being stubbed or omitted.

Clarify the code, implement effective irritator handlers for the test
programs that can have them and then switch the signals generated by the
fp-stress program over to use the irritators, ensuring that we validate
that we restore the saved signal context properly.

Signed-off-by: Mark Brown 
---
Mark Brown (6):
  kselftest/arm64: Correct misleading comments on fp-stress irritators
  kselftest/arm64: Remove unused ADRs from irritator handlers
  kselftest/arm64: Corrupt P15 in the irritator when testing SSVE
  kselftest/arm64: Implement irritators for ZA and ZT
  kselftest/arm64: Provide a SIGUSR1 handler in the kernel mode FP stress 
test
  kselftest/arm64: Test signal handler state modification in fp-stress

 tools/testing/selftests/arm64/fp/fp-stress.c   |  2 +-
 tools/testing/selftests/arm64/fp/fpsimd-test.S |  4 +---
 tools/testing/selftests/arm64/fp/kernel-test.c |  4 
 tools/testing/selftests/arm64/fp/sve-test.S|  6 ++
 tools/testing/selftests/arm64/fp/za-test.S | 13 -
 tools/testing/selftests/arm64/fp/zt-test.S | 13 -
 6 files changed, 16 insertions(+), 26 deletions(-)
---
base-commit: 8e929cb546ee42c9a61d24fae60605e9e3192354
change-id: 20241023-arm64-fp-stress-irritator-5415fe92adef

Best regards,
-- 
Mark Brown

[PATCH 6/6] kselftest/arm64: Test signal handler state modification in fp-stress

2024-10-23 Thread Mark Brown

Currently in fp-stress we test signal delivery to the test threads by
sending SIGUSR2 which simply counts how many signals are delivered. The
test programs now also all have a SIGUSR1 handler which for the threads
doing userspace testing additionally modifies the floating point register
state in the signal handler, verifying that when we return the saved
register state is restored from the signal context as expected. Switch over
to triggering that to validate that we are restoring as expected.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/arm64/fp/fp-stress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/arm64/fp/fp-stress.c 
b/tools/testing/selftests/arm64/fp/fp-stress.c
index 
faac24bdefeb9436e2daf20b7250d0ae25ca23a7..3d477249dee0632b662b48582433d39323d18e18
 100644
--- a/tools/testing/selftests/arm64/fp/fp-stress.c
+++ b/tools/testing/selftests/arm64/fp/fp-stress.c
@@ -221,7 +221,7 @@ static void child_output(struct child_data *child, uint32_t 
events,
 static void child_tickle(struct child_data *child)
 {
if (child->output_seen && !child->exited)
-   kill(child->pid, SIGUSR2);
+   kill(child->pid, SIGUSR1);
 }
 
 static void child_stop(struct child_data *child)

-- 
2.39.2

[PATCH 5/6] kselftest/arm64: Provide a SIGUSR1 handler in the kernel mode FP stress test

2024-10-23 Thread Mark Brown

The other stress test programs provide a SIGUSR1 handler which modifies the
live register state in order to validate that signal context is being
restored during signal return. While we can't usefully do this when testing
kernel mode FP usage provide a handler for SIGUSR1 which just counts the
number of signals like we do for SIGUSR2, allowing fp-stress to treat all
the test programs uniformly.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/arm64/fp/kernel-test.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/tools/testing/selftests/arm64/fp/kernel-test.c 
b/tools/testing/selftests/arm64/fp/kernel-test.c
index 
e8da3b4cbd23202c6504ffd8043f8ef351d739f6..859345379044fc287458644309d66cf5f3d8bdf5
 100644
--- a/tools/testing/selftests/arm64/fp/kernel-test.c
+++ b/tools/testing/selftests/arm64/fp/kernel-test.c
@@ -267,6 +267,10 @@ int main(void)
   strerror(errno), errno);
 
sa.sa_sigaction = handle_kick_signal;
+   ret = sigaction(SIGUSR1, &sa, NULL);
+   if (ret < 0)
+   printf("Failed to install SIGUSR1 handler: %s (%d)\n",
+  strerror(errno), errno);
ret = sigaction(SIGUSR2, &sa, NULL);
if (ret < 0)
printf("Failed to install SIGUSR2 handler: %s (%d)\n",

-- 
2.39.2

[PATCH 3/6] kselftest/arm64: Corrupt P15 in the irritator when testing SSVE

2024-10-23 Thread Mark Brown

When building for streaming SVE the irritator for SVE skips updates of both
P15 and FFR. While FFR is skipped since it might not be present there is no
reason to skip corrupting P15 so move the ifdef appropriately.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/arm64/fp/sve-test.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/arm64/fp/sve-test.S 
b/tools/testing/selftests/arm64/fp/sve-test.S
index 
f1fb9745c681786f686f1fafcb7e1154f3c8e1a3..3c88dfe9c8cad29f44217314aeaffa984bac05e5
 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -302,9 +302,9 @@ function irritator_handler
moviv0.8b, #1
moviv9.16b, #2
moviv31.8b, #3
-#ifndef SSVE
// And P0
rdffr   p0.b
+#ifndef SSVE
// And FFR
wrffr   p15.b
 #endif

-- 
2.39.2

[PATCH 4/6] kselftest/arm64: Implement irritators for ZA and ZT

2024-10-23 Thread Mark Brown

Currently we don't use the irritator signal in our floating point stress
tests so when we added ZA and ZT stress tests we didn't actually bother
implementing any actual action in the handlers, we just counted the signal
deliveries. In preparation for using the irritators let's implement them,
just trivially SMSTOP and SMSTART to reset all bits in the register to 0.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/arm64/fp/za-test.S | 12 
 tools/testing/selftests/arm64/fp/zt-test.S | 12 
 2 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/tools/testing/selftests/arm64/fp/za-test.S 
b/tools/testing/selftests/arm64/fp/za-test.S
index 
1ee0ec36766d2bef92aff50a002813e76e22963c..f902e6ef9077bfa34fa7f85ce572ce3df4346789
 100644
--- a/tools/testing/selftests/arm64/fp/za-test.S
+++ b/tools/testing/selftests/arm64/fp/za-test.S
@@ -148,20 +148,16 @@ function check_za
b   memcmp
 endfunction
 
-// Any SME register modified here can cause corruption in the main
-// thread -- but *only* the locations modified here.
+// Modify the live SME register state, signal return will undo our changes
 function irritator_handler
// Increment the irritation signal count (x23):
ldr x0, [x2, #ucontext_regs + 8 * 23]
add x0, x0, #1
str x0, [x2, #ucontext_regs + 8 * 23]
 
-   // Corrupt some random ZA data
-#if 0
-   moviv0.8b, #1
-   moviv9.16b, #2
-   moviv31.8b, #3
-#endif
+   // This will reset ZA to all bits 0
+   smstop
+   smstart
 
ret
 endfunction
diff --git a/tools/testing/selftests/arm64/fp/zt-test.S 
b/tools/testing/selftests/arm64/fp/zt-test.S
index 
ade9c98abcdafc2755ef4796670566d99e919e5c..c96cb7c2ad4b406c54099fe3f73917259bd947e4
 100644
--- a/tools/testing/selftests/arm64/fp/zt-test.S
+++ b/tools/testing/selftests/arm64/fp/zt-test.S
@@ -117,20 +117,16 @@ function check_zt
b   memcmp
 endfunction
 
-// Any SME register modified here can cause corruption in the main
-// thread -- but *only* the locations modified here.
+// Modify the live SME register state, signal return will undo our changes
 function irritator_handler
// Increment the irritation signal count (x23):
ldr x0, [x2, #ucontext_regs + 8 * 23]
add x0, x0, #1
str x0, [x2, #ucontext_regs + 8 * 23]
 
-   // Corrupt some random ZT data
-#if 0
-   moviv0.8b, #1
-   moviv9.16b, #2
-   moviv31.8b, #3
-#endif
+   // This will reset ZT to all bits 0
+   smstop
+   smstart
 
ret
 endfunction

-- 
2.39.2

Re: [PATCH v1 04/10] iommufd/viommu: Allow drivers to control vdev_id lifecycle

2024-10-23 Thread Nicolin Chen

On Wed, Oct 23, 2024 at 01:59:05PM -0300, Jason Gunthorpe wrote:
> > [Comparison]  | This v1 | Draft
> > 1. Adds to master | A lock and vdev ptr | A lock and viommu ptr
> > 2. Set/unset ptr  | In ->vdevice_alloc/free | In all ->attach_dev
> > 3. Do dev_to_vdev | master->vdev->id| attach_handle?
> 
> The set/unset ops have the major issue that they can get out of sync
> with the domain. The only time things should be routed to the viommu
> is when a viommu related domain is attached.

Ah, I missed that point.

> The lock on attach can be reduced:
> 
>   iommu_group_mutex_assert(dev)
>   if (domain->type == IOMMU_DOMAIN_NESTED)
>   new_vsmmu = to_smmu_domain(domain)->vsmmu;
>   else
>   new_vsmmu = NULL;
>   if (new_vsmmu != master->vsmmu) {
>   down_write(&master->lock);
>   master->vsmmu = new_vsmmu;
>   up_write(&master->lock);
>   }
> 
> And you'd stick this in prepare or commit..

Ack.

> > Both solutions needs a driver-level lock and an extra pointer in
> > the master structure. And both ISR routines require that driver-
> > level lock to avoid race against attach_dev v.s. vdev alloc/free.
> > Overall, taking step.3 into consideration, it seems that letting
> > master lock&hold the vdev pointer (i.e. this v1) is simpler?
> 
> I'm not sure the vdev pointer should even be visible to the drivers..

With the iommufd_vdevice_alloc allocator, we already expose the
vdevice structure to the drivers, along with the vdevice_alloc
vdevice_free ops, which would be easier for the vCMDQ driver to
allocate and hold its own pSID/vSID structure.

And vsid_to_psid() requires to look up the viommu->vdevs xarray.
If we hid the vDEVICE structure, we'd need something else than
the vdev_to_dev(). Maybe iommufd_viommu_find_dev_by_virt_id()?

> > As for the implementation of iommufd_viommu_dev_to_vdev(), I read
> > the attach_handle part in the PRI code, yet I don't see the lock
> > that protects the handle returned by iommu_attach_handle_get() in
> > iommu_report_device_fault/find_fault_handler().
> 
> It is the xa_lock and some rules about flushing irqs and work queues
> before allowing the dev to disappear:
> 
> >   "Callers are required to synchronize the call of
> >iommu_attach_handle_get() with domain attachment
> >and detachment. The attach handle can only be used
> >during its life cycle."
> 
> > But the caller iommu_report_device_fault() is an async event that
> > cannot guarantee the lifecycle. Would you please shed some light?
> 
> The iopf detatch function will act as a barrirer to ensure that all
> the async work has completed, sort of like how RCU works.

The xa_lock(&group->pasid_array) is released once the handle is
returned to the iommu_attach_handle_get callers, so it protects
only for a very short window (T0 below). What if:
   | detach()   | isr=>iommu_report_device_fault()
T0 | Get attach_handle [xa_lock]| Get attach_handle [xa_lock]
T1 | Clean deliver Q [fault->mutex] | Waiting for fault->mutex
T2 | iommufd_eventq_iopf_disable()  | Add new fault to the deliver Q
T3 | kfree(handle)  | ?? 

> But here, I think it is pretty simple, isn't it?
> 
> When you update the master->vsmmu you can query the vsmmu to get the
> vdev id of that master, then store it in the master struct and forward
> it to the iommufd_viommu_report_irq(). That could even search the
> xarray since attach is not a performance path.
> 
> Then it is locked under the master->lock

Yes! I didn't see that coming. vdev->id must be set before the
attach to a nested domain, and can be cleaned after the device
detaches. Maybe an attach to vIOMMU-based nested domain should
just fail if idev->vdev isn't ready?

Then perhaps we can have a struct arm_smmu_vstream to hold all
the things, such as vsmmu pointer and vdev->id. If vCMDQ needs
an extra structure, it can be stored there too.

Thanks!
Nicolin

Re: [PATCH 3/4] ASoC: SOF: imx: add driver for imx95

2024-10-23 Thread Frank Li

On Wed, Oct 23, 2024 at 12:21:13PM -0400, Laurentiu Mihalcea wrote:
> From: Laurentiu Mihalcea 
>
> Add SOF driver for imx95.
>
> Signed-off-by: Laurentiu Mihalcea 
> ---
>  sound/soc/sof/imx/Kconfig  |   8 +
>  sound/soc/sof/imx/Makefile |   2 +
>  sound/soc/sof/imx/imx95.c  | 383 +
>  3 files changed, 393 insertions(+)
>  create mode 100644 sound/soc/sof/imx/imx95.c
>
> diff --git a/sound/soc/sof/imx/Kconfig b/sound/soc/sof/imx/Kconfig
> index 4751b04d5e6f..51a70a193533 100644
> --- a/sound/soc/sof/imx/Kconfig
> +++ b/sound/soc/sof/imx/Kconfig
> @@ -50,4 +50,12 @@ config SND_SOC_SOF_IMX8ULP
> Say Y if you have such a device.
> If unsure select "N".
>
> +config SND_SOC_SOF_IMX95
> + tristate "SOF support for i.MX95"
> + depends on IMX_DSP
> + help
> +   This adds support for Sound Open Firmware for NXP i.MX95 platforms.
> +   Say Y if you have such a device.
> +   If unsure select "N".
> +
>  endif ## SND_SOC_SOF_IMX_TOPLEVEL
> diff --git a/sound/soc/sof/imx/Makefile b/sound/soc/sof/imx/Makefile
> index be0bf0736dfa..715ac3798668 100644
> --- a/sound/soc/sof/imx/Makefile
> +++ b/sound/soc/sof/imx/Makefile
> @@ -2,10 +2,12 @@
>  snd-sof-imx8-y := imx8.o
>  snd-sof-imx8m-y := imx8m.o
>  snd-sof-imx8ulp-y := imx8ulp.o
> +snd-sof-imx95-y := imx95.o
>
>  snd-sof-imx-common-y := imx-common.o
>
>  obj-$(CONFIG_SND_SOC_SOF_IMX8) += snd-sof-imx8.o
>  obj-$(CONFIG_SND_SOC_SOF_IMX8M) += snd-sof-imx8m.o
>  obj-$(CONFIG_SND_SOC_SOF_IMX8ULP) += snd-sof-imx8ulp.o
> +obj-$(CONFIG_SND_SOC_SOF_IMX95) += snd-sof-imx95.o
>  obj-$(CONFIG_SND_SOC_SOF_IMX_COMMON) += imx-common.o
> diff --git a/sound/soc/sof/imx/imx95.c b/sound/soc/sof/imx/imx95.c
> new file mode 100644
> index ..3f7ed6a16c42
> --- /dev/null
> +++ b/sound/soc/sof/imx/imx95.c
> @@ -0,0 +1,383 @@
> +// SPDX-License-Identifier: (GPL-2.0-only OR BSD-3-Clause)
> +/*
> + * Copyright 2024 NXP
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

sort by alphabet order

> +
> +#include "../sof-of-dev.h"
> +#include "../ops.h"
> +
> +#define IMX_SIP_SRC 0xC205
> +#define IMX_SIP_SRC_M_RESET_ADDR_SET 0x03
> +
> +#define IMX95_CPU_VEC_FLAGS_BOOT BIT(29)
> +
> +#define IMX_SIP_LMM 0xC20F
> +#define IMX_SIP_LMM_BOOT 0x0
> +#define IMX_SIP_LMM_SHUTDOWN 0x1
> +
> +#define IMX95_M7_LM_ID 0x1
> +
> +#define MBOX_DSPBOX_OFFSET 0x1000
> +
> +struct imx95_priv {
> + struct platform_device *ipc_dev;
> + struct imx_dsp_ipc *ipc_handle;
> + resource_size_t bootaddr;
> +};
> +
> +static void imx95_ipc_handle_reply(struct imx_dsp_ipc *ipc)
> +{
> + unsigned long flags;
> + struct snd_sof_dev *sdev = imx_dsp_get_data(ipc);
> +
> + spin_lock_irqsave(&sdev->ipc_lock, flags);
> + snd_sof_ipc_process_reply(sdev, 0);
> + spin_unlock_irqrestore(&sdev->ipc_lock, flags);
> +}
> +
> +static void imx95_ipc_handle_request(struct imx_dsp_ipc *ipc)
> +{
> + snd_sof_ipc_msgs_rx(imx_dsp_get_data(ipc));
> +}
> +
> +static struct imx_dsp_ops ipc_ops = {
> + .handle_reply = imx95_ipc_handle_reply,
> + .handle_request = imx95_ipc_handle_request,
> +};
> +
> +static int imx95_disable_enable_core(bool enable)
> +{
> + struct arm_smccc_res res;
> +
> + if (enable)
> + arm_smccc_smc(IMX_SIP_LMM, IMX_SIP_LMM_BOOT, IMX95_M7_LM_ID,
> +   0, 0, 0, 0, 0, &res);
> + else
> + arm_smccc_smc(IMX_SIP_LMM, IMX_SIP_LMM_SHUTDOWN, IMX95_M7_LM_ID,
> +   0, 0, 0, 0, 0, &res);
> +
> + return res.a0;
> +}
> +
> +static int imx95_run(struct snd_sof_dev *sdev)
> +{
> + return imx95_disable_enable_core(true);
> +}
> +
> +static int imx95_send_msg(struct snd_sof_dev *sdev, struct snd_sof_ipc_msg 
> *msg)
> +{
> + struct imx95_priv *priv = sdev->pdata->hw_pdata;
> +
> + sof_mailbox_write(sdev, sdev->host_box.offset,
> +   msg->msg_data, msg->msg_size);
> +
> + imx_dsp_ring_doorbell(priv->ipc_handle, 0);
> +
> + return 0;
> +}
> +
> +static int imx95_get_mailbox_offset(struct snd_sof_dev *sdev)
> +{
> + return MBOX_DSPBOX_OFFSET;
> +}
> +
> +static int imx95_get_bar_index(struct snd_sof_dev *sdev, u32 type)
> +{
> + switch (type) {
> + case SOF_FW_BLK_TYPE_SRAM:
> + case SOF_FW_BLK_TYPE_DRAM:
> + return type;
> + default:
> + return -EINVAL;
> + }
> +}
> +
> +static int imx95_get_window_offset(struct snd_sof_dev *sdev, u32 id)
> +{
> + /* no offset for window regions - they are already relative
> +  * to the mailbox memory region described in DT.
> +  */
> + return 0;
> +}
> +
> +static int imx95_set_power_state(struct snd_sof_dev *sdev,
> +  const struct sof_dsp_power_state *target_state)
> +{
> + sdev->dsp_power_state = *target_state;
> +
> + return 0;
> +}
> +
> +/* no other resources to power on during (runtime) resume s

Re: [PATCH v2 1/2] selftest: rtc: Add to check rtc alarm status for alarm related test

2024-10-23 Thread Shuah Khan


On 10/22/24 10:01, Alexandre Belloni wrote:

On 20/10/2024 20:22:13-0700, Joseph Jang wrote:

In alarm_wkalm_set and alarm_wkalm_set_minute test, they use different
ioctl (RTC_ALM_SET/RTC_WKALM_SET) for alarm feature detection. They will
skip testing if RTC_ALM_SET/RTC_WKALM_SET ioctl returns an EINVAL error
code. This design may miss detecting real problems when the
efi.set_wakeup_time() return errors and then RTC_ALM_SET/RTC_WKALM_SET
ioctl returns an EINVAL error code with RTC_FEATURE_ALARM enabled.

In order to make rtctest more explicit and robust, we propose to use
RTC_PARAM_GET ioctl interface to check rtc alarm feature state before
running alarm related tests. If the kernel does not support RTC_PARAM_GET
ioctl interface, we will fallback to check the error number of
(RTC_ALM_SET/RTC_WKALM_SET) ioctl call for alarm feature detection.

Requires commit 101ca8d05913b ("rtc: efi: Enable SET/GET WAKEUP services
as optional")

Reviewed-by: Koba Ko 
Reviewed-by: Matthew R. Ochs 
Signed-off-by: Joseph Jang 


Acked-by: Alexandre Belloni 


---
Changes in v2:
- Changed to use $(top_srcdir) instead of hardcoding the path.



Thanks.

Applied to linux-kselftest next for Linux 6.13-rc1

thanks,
-- Shuah

Re: [PATCH 4/4] arm64: dts: imx: add imx95 dts for sof

2024-10-23 Thread Frank Li

On Wed, Oct 23, 2024 at 12:21:14PM -0400, Laurentiu Mihalcea wrote:
> From: Laurentiu Mihalcea 
>
> Add imx95 DTS for SOF usage.
>
> Signed-off-by: Laurentiu Mihalcea 
> ---
>  arch/arm64/boot/dts/freescale/Makefile|  1 +
>  .../dts/freescale/imx95-19x19-evk-sof.dts | 86 +++
>  2 files changed, 87 insertions(+)
>  create mode 100644 arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts
>
> diff --git a/arch/arm64/boot/dts/freescale/Makefile 
> b/arch/arm64/boot/dts/freescale/Makefile
> index 2a69b7ec6d6d..94660e3e8b2b 100644
> --- a/arch/arm64/boot/dts/freescale/Makefile
> +++ b/arch/arm64/boot/dts/freescale/Makefile
> @@ -267,6 +267,7 @@ dtb-$(CONFIG_ARCH_MXC) += imx93-tqma9352-mba93xxca.dtb
>  dtb-$(CONFIG_ARCH_MXC) += imx93-tqma9352-mba93xxla.dtb
>  dtb-$(CONFIG_ARCH_MXC) += imx93-var-som-symphony.dtb
>  dtb-$(CONFIG_ARCH_MXC) += imx95-19x19-evk.dtb
> +dtb-$(CONFIG_ARCH_MXC) += imx95-19x19-evk-sof.dtb
>
>  imx8mm-kontron-dl-dtbs   := imx8mm-kontron-bl.dtb 
> imx8mm-kontron-dl.dtbo
>
> diff --git a/arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts 
> b/arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts
> new file mode 100644
> index ..b10dc1af5ce2
> --- /dev/null
> +++ b/arch/arm64/boot/dts/freescale/imx95-19x19-evk-sof.dts
> @@ -0,0 +1,86 @@
> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
> +/*
> + * Copyright 2024 NXP
> + */
> +
> +/dts-v1/;
> +
> +#include "imx95-19x19-evk.dts"
> +
> +/ {
> + reserved-memory {
> + adma_res: memory@8610 {
> + compatible = "shared-dma-pool";
> + reg = <0x0 0x8610 0x0 0x10>;
> + no-map;
> + };
> + };
> +
> + sound-wm8962 {
> + status = "disabled";
> + };
> +
> + sof-sound-wm8962 {
> + compatible = "audio-graph-card2";
> +
> + links = <&cpu>;
> + label = "wm8962-audio";
> +
> + hp-det-gpios = <&gpio2 11 GPIO_ACTIVE_HIGH>;
> +
> + pinctrl-names = "default";
> + pinctrl-0 = <&pinctrl_hp>;
> +
> + widgets =
> + "Headphone", "Headphones",
> + "Microphone", "Headset Mic";
> + routing =
> + "Headphones", "HPOUTL",
> + "Headphones", "HPOUTR",
> + "Headset Mic", "MICBIAS",
> + "IN3R", "Headset Mic",
> + "IN1R", "Headset Mic";
> + };
> +
> + sof_cpu: cm7-cpu@8000 {
> + compatible = "fsl,imx95-cm7-sof";

needn't space, and remove other extra space.

Can you try https://github.com/lznuaa/dt-format
to order these nodes and properties.

Frank

> +
> + reg = <0x0 0x8000 0x0 0x40>,
> +   <0x0 0x8600 0x0 0x3000>;
> + reg-names = "dram", "mailbox";
> +
> + memory-region = <&adma_res>;
> +
> + mboxes = <&mu7 2 0>, <&mu7 2 1>, <&mu7 3 0>, <&mu7 3 1>;
> + mbox-names = "txdb0", "txdb1", "rxdb0", "rxdb1";
> +
> + cpu: port {
> + cpu_ep: endpoint { remote-endpoint = <&codec_ep>; };
> + };
> + };
> +};
> +
> +&wm8962 {
> + assigned-clocks = <&scmi_clk IMX95_CLK_AUDIOPLL1_VCO>,
> +   <&scmi_clk IMX95_CLK_AUDIOPLL2_VCO>,
> +   <&scmi_clk IMX95_CLK_AUDIOPLL1>,
> +   <&scmi_clk IMX95_CLK_AUDIOPLL2>,
> +   <&scmi_clk IMX95_CLK_SAI3>;
> + assigned-clock-parents = <0>, <0>, <0>, <0>, <&scmi_clk 
> IMX95_CLK_AUDIOPLL1>;
> + assigned-clock-rates = <393216>, <3612672000>,
> +<393216000>, <361267200>,
> +<12288000>;
> + status = "okay";
> +
> + port {
> + codec_ep: endpoint { remote-endpoint = <&cpu_ep>; };
> + };
> +};
> +
> +&edma2 {
> + dma-channel-mask = <0x3fff>, <0x>;
> +};
> +
> +&sai3 {
> + status = "disabled";
> +};
> --
> 2.34.1
>

[PATCH net-next v5 08/12] selftests: ncdevmem: Use YNL to enable TCP header split

2024-10-23 Thread Stanislav Fomichev

In the next patch the hard-coded queue numbers are gonna be removed.
So introduce some initial support for ethtool YNL and use
it to enable header split.

Also, tcp-data-split requires latest ethtool which is unlikely
to be present in the distros right now.

(ideally, we should not shell out to ethtool at all).

Reviewed-by: Mina Almasry 
Signed-off-by: Stanislav Fomichev 
---
 tools/testing/selftests/net/Makefile   |  2 +-
 tools/testing/selftests/net/ncdevmem.c | 57 +-
 2 files changed, 56 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/net/Makefile 
b/tools/testing/selftests/net/Makefile
index 26a4883a65c9..759b1d2dc8b4 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -111,7 +111,7 @@ TEST_INCLUDES := forwarding/lib.sh
 include ../lib.mk
 
 # YNL build
-YNL_GENS := netdev
+YNL_GENS := ethtool netdev
 include ynl.mk
 
 $(OUTPUT)/epoll_busy_poll: LDLIBS += -lcap
diff --git a/tools/testing/selftests/net/ncdevmem.c 
b/tools/testing/selftests/net/ncdevmem.c
index 697771c1f9fa..e1faad46548b 100644
--- a/tools/testing/selftests/net/ncdevmem.c
+++ b/tools/testing/selftests/net/ncdevmem.c
@@ -55,10 +55,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
 #include "netdev-user.h"
+#include "ethtool-user.h"
 #include 
 
 #define PAGE_SHIFT 12
@@ -231,10 +233,58 @@ static int reset_flow_steering(void)
return 0;
 }
 
+static const char *tcp_data_split_str(int val)
+{
+   switch (val) {
+   case 0:
+   return "off";
+   case 1:
+   return "auto";
+   case 2:
+   return "on";
+   default:
+   return "?";
+   }
+}
+
 static int configure_headersplit(bool on)
 {
-   return run_command("sudo ethtool -G %s tcp-data-split %s >&2", ifname,
-  on ? "on" : "off");
+   struct ethtool_rings_get_req *get_req;
+   struct ethtool_rings_get_rsp *get_rsp;
+   struct ethtool_rings_set_req *req;
+   struct ynl_error yerr;
+   struct ynl_sock *ys;
+   int ret;
+
+   ys = ynl_sock_create(&ynl_ethtool_family, &yerr);
+   if (!ys) {
+   fprintf(stderr, "YNL: %s\n", yerr.msg);
+   return -1;
+   }
+
+   req = ethtool_rings_set_req_alloc();
+   ethtool_rings_set_req_set_header_dev_index(req, ifindex);
+   /* 0 - off, 1 - auto, 2 - on */
+   ethtool_rings_set_req_set_tcp_data_split(req, on ? 2 : 0);
+   ret = ethtool_rings_set(ys, req);
+   if (ret < 0)
+   fprintf(stderr, "YNL failed: %s\n", ys->err.msg);
+   ethtool_rings_set_req_free(req);
+
+   if (ret == 0) {
+   get_req = ethtool_rings_get_req_alloc();
+   ethtool_rings_get_req_set_header_dev_index(get_req, ifindex);
+   get_rsp = ethtool_rings_get(ys, get_req);
+   ethtool_rings_get_req_free(get_req);
+   if (get_rsp)
+   fprintf(stderr, "TCP header split: %s\n",
+   tcp_data_split_str(get_rsp->tcp_data_split));
+   ethtool_rings_get_rsp_free(get_rsp);
+   }
+
+   ynl_sock_destroy(ys);
+
+   return ret;
 }
 
 static int configure_rss(void)
@@ -370,6 +420,9 @@ int do_server(struct memory_buffer *mem)
if (reset_flow_steering())
error(1, 0, "Failed to reset flow steering\n");
 
+   if (configure_headersplit(1))
+   error(1, 0, "Failed to enable TCP header split\n");
+
/* Configure RSS to divert all traffic from our devmem queues */
if (configure_rss())
error(1, 0, "Failed to configure rss\n");
-- 
2.47.0

Re: [PATCH v4 1/2] virt: pvmemcontrol: control guest physical memory properties

2024-10-23 Thread kernel test robot

Hi Yuanchu,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.12-rc4 next-20241023]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:
https://github.com/intel-lab-lkp/linux/commits/Yuanchu-Xie/virt-pvmemcontrol-add-Yuanchu-and-Pasha-as-maintainers/20241022-045035
base:   linus/master
patch link:
https://lore.kernel.org/r/20241021204849.1580384-1-yuanchu%40google.com
patch subject: [PATCH v4 1/2] virt: pvmemcontrol: control guest physical memory 
properties
config: powerpc-randconfig-002-20241024 
(https://download.01.org/0day-ci/archive/20241024/202410240605.hr0jfxmp-...@intel.com/config)
compiler: powerpc-linux-gcc (GCC) 14.1.0
reproduce (this is a W=1 build): 
(https://download.01.org/0day-ci/archive/20241024/202410240605.hr0jfxmp-...@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot 
| Closes: 
https://lore.kernel.org/oe-kbuild-all/202410240605.hr0jfxmp-...@intel.com/

All error/warnings (new ones prefixed by >>):

>> drivers/virt/pvmemcontrol/pvmemcontrol.c:455:1: warning: data definition has 
>> no type or storage class
 455 | module_pci_driver(pvmemcontrol_pci_driver);
 | ^
>> drivers/virt/pvmemcontrol/pvmemcontrol.c:455:1: error: type defaults to 
>> 'int' in declaration of 'module_pci_driver' [-Wimplicit-int]
>> drivers/virt/pvmemcontrol/pvmemcontrol.c:455:1: error: parameter names 
>> (without types) in function declaration 
>> [-Wdeclaration-missing-parameter-type]
>> drivers/virt/pvmemcontrol/pvmemcontrol.c:449:26: warning: 
>> 'pvmemcontrol_pci_driver' defined but not used [-Wunused-variable]
 449 | static struct pci_driver pvmemcontrol_pci_driver = {
 |  ^~~


vim +455 drivers/virt/pvmemcontrol/pvmemcontrol.c

   448  
 > 449  static struct pci_driver pvmemcontrol_pci_driver = {
   450  .name = "pvmemcontrol",
   451  .id_table = pvmemcontrol_pci_id_tbl,
   452  .probe = pvmemcontrol_pci_probe,
   453  .remove = pvmemcontrol_pci_remove,
   454  };
 > 455  module_pci_driver(pvmemcontrol_pci_driver);
   456  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

Re: [PATCH] sched_ext: Fix function pointer type mismatches in BPF selftests

2024-10-23 Thread Tejun Heo

On Wed, Oct 23, 2024 at 02:15:00PM +0530, Vishal Chourasia wrote:
> Fix incompatible function pointer type warnings in sched_ext BPF selftests by
> explicitly casting the function pointers when initializing struct_ops.
> This addresses multiple -Wincompatible-function-pointer-types warnings from 
> the
> clang compiler where function signatures didn't match exactly.
> 
> The void * cast ensures the compiler accepts the function pointer
> assignment despite minor type differences in the parameters.

Can you repost with Signed-off-by?

Thanks.

-- 
tejun

[PATCH v2 0/2] soc: qcom: pmic_glink: Resolve failures to bring up pmic_glink

2024-10-23 Thread Bjorn Andersson

With the transition of pd-mapper into the kernel, the timing was altered
such that on some targets the initial rpmsg_send() requests from
pmic_glink clients would be attempted before the firmware had announced
intents, and the firmware reject intent requests.

Fix this

Signed-off-by: Bjorn Andersson 
---
Changes in v2:
- Introduced "intents" and fixed a few spelling mistakes in the commit
  message of patch 1
- Cleaned up log snippet in commit message of patch 2, added battery
  manager log
- Changed the arbitrary 10 second timeout to 5... Ought to be enough for
  anybody.
- Added a small sleep in the send-loop in patch 2, and by that
  refactored the loop completely.
- Link to v1: 
https://lore.kernel.org/r/20241022-pmic-glink-ecancelled-v1-0-9e26fc74e...@oss.qualcomm.com

---
Bjorn Andersson (2):
  rpmsg: glink: Handle rejected intent request better
  soc: qcom: pmic_glink: Handle GLINK intent allocation rejections

 drivers/rpmsg/qcom_glink_native.c | 10 +++---
 drivers/soc/qcom/pmic_glink.c | 25 ++---
 2 files changed, 29 insertions(+), 6 deletions(-)
---
base-commit: 42f7652d3eb527d03665b09edac47f85fb600924
change-id: 20241022-pmic-glink-ecancelled-d899a9ca0358

Best regards,
-- 
Bjorn Andersson

Re: [PATCH v4 18/19] kbuild: Add gendwarfksyms as an alternative to genksyms

2024-10-23 Thread Sami Tolvanen

Hi,

On Wed, Oct 23, 2024 at 2:59 PM Petr Pavlu  wrote:
>
> On 10/8/24 20:38, Sami Tolvanen wrote:
> > +gendwarfksyms := scripts/gendwarfksyms/gendwarfksyms
> > +getexportsymbols = $(NM) $(1) | sed -n 's/.* __export_symbol_\(.*\)/$(2)/p'
> > +
> >  genksyms = scripts/genksyms/genksyms \
> >   $(if $(1), -T $(2)) \
> >   $(if $(KBUILD_PRESERVE), -p)\
> >   -r $(or $(wildcard $(2:.symtypes=.symref)), /dev/null)
> >
> >  # These mirror gensymtypes_S and co below, keep them in synch.
> > +ifdef CONFIG_GENDWARFKSYMS
> > +symtypes_dep_c = $(obj)/%.o
> > +cmd_gensymtypes_c = $(if $(skip_gendwarfksyms),, \
> > + $(call getexportsymbols,$(2:.symtypes=.o),\1) | \
> > + $(gendwarfksyms) $(2:.symtypes=.o) $(if $(1), --symtypes $(2)))
>
> Is it possible to pass options to gendwarfksyms that apply to the entire
> build, specifically, how can one say to use the --stable option? If not
> then I think it would be good to add something as
> KBUILD_GENDWARFKSYMS_STABLE (similar to KBUILD_PRESERVE), or maybe
> a generic GENDWARFKSYMSFLAGS?

Yeah, I left that as an exercise to the user in this version, but I
agree, adding a KBUILD_ flag seems reasonable.

Sami

[PATCH 1/6] kselftest/arm64: Correct misleading comments on fp-stress irritators

2024-10-23 Thread Mark Brown

The comments in the handlers for the irritator signal in the test threads
for fp-stress suggest that the irritator will corrupt the register state
observed by the main thread but this is not the case, instead the FPSIMD
and SVE irritators (which are the only ones that are implemented) modify
the current register state which is expected to be overwritten on return
from the handler by the saved register state. Update the comment to reflect
what the handler is actually doing.

Signed-off-by: Mark Brown 
---
 tools/testing/selftests/arm64/fp/fpsimd-test.S | 3 +--
 tools/testing/selftests/arm64/fp/sve-test.S| 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/arm64/fp/fpsimd-test.S 
b/tools/testing/selftests/arm64/fp/fpsimd-test.S
index 
8b960d01ed2e0ef516893b68794078ddf8c01e1f..bdfb7cf2e4ec175fda62c1c2f38c6ebb1a1c48bf
 100644
--- a/tools/testing/selftests/arm64/fp/fpsimd-test.S
+++ b/tools/testing/selftests/arm64/fp/fpsimd-test.S
@@ -134,8 +134,7 @@ function check_vreg
b   memcmp
 endfunction
 
-// Any SVE register modified here can cause corruption in the main
-// thread -- but *only* the registers modified here.
+// Modify live register state, the signal return will undo our changes
 function irritator_handler
// Increment the irritation signal count (x23):
ldr x0, [x2, #ucontext_regs + 8 * 23]
diff --git a/tools/testing/selftests/arm64/fp/sve-test.S 
b/tools/testing/selftests/arm64/fp/sve-test.S
index 
fff60e2a25addfd4850ef71aa3cf6535ac880ffd..e3c0d585684df29723a49265f3df6d23817498c7
 100644
--- a/tools/testing/selftests/arm64/fp/sve-test.S
+++ b/tools/testing/selftests/arm64/fp/sve-test.S
@@ -291,8 +291,7 @@ function check_ffr
 #endif
 endfunction
 
-// Any SVE register modified here can cause corruption in the main
-// thread -- but *only* the registers modified here.
+// Modify live register state, the signal return will undo our changes
 function irritator_handler
// Increment the irritation signal count (x23):
ldr x0, [x2, #ucontext_regs + 8 * 23]

-- 
2.39.2

[PATCH] vsock/test: fix failures due to wrong SO_RCVLOWAT parameter

2024-10-23 Thread Konstantin Shkolnyy

This happens on 64-bit big-endian machines.
SO_RCVLOWAT requires an int parameter. However, instead of int, the test
uses unsigned long in one place and size_t in another. Both are 8 bytes
long on 64-bit machines. The kernel, having received the 8 bytes, doesn't
test for the exact size of the parameter, it only cares that it's >=
sizeof(int), and casts the 4 lower-addressed bytes to an int, which, on
a big-endian machine, contains 0. 0 doesn't trigger an error, SO_RCVLOWAT
returns with success and the socket stays with the default SO_RCVLOWAT = 1,
which results in test failures.

Signed-off-by: Konstantin Shkolnyy 
---

Notes:
The problem was found on s390 (big endian), while x86-64 didn't show it. 
After this fix, all tests pass on s390.

 tools/testing/vsock/vsock_test.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 8d38dbf8f41f..7fd25b814b4b 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -835,7 +835,7 @@ static void test_stream_poll_rcvlowat_server(const struct 
test_opts *opts)
 
 static void test_stream_poll_rcvlowat_client(const struct test_opts *opts)
 {
-   unsigned long lowat_val = RCVLOWAT_BUF_SIZE;
+   int lowat_val = RCVLOWAT_BUF_SIZE;
char buf[RCVLOWAT_BUF_SIZE];
struct pollfd fds;
short poll_flags;
@@ -1357,7 +1357,7 @@ static void 
test_stream_rcvlowat_def_cred_upd_client(const struct test_opts *opt
 static void test_stream_credit_update_test(const struct test_opts *opts,
   bool low_rx_bytes_test)
 {
-   size_t recv_buf_size;
+   int recv_buf_size;
struct pollfd fds;
size_t buf_size;
void *buf;
-- 
2.34.1

Re: [PATCH v4 15/19] gendwarfksyms: Add support for reserved and ignored fields

2024-10-23 Thread Sami Tolvanen

On Wed, Oct 23, 2024 at 2:53 PM Petr Pavlu  wrote:
>
> I've noted some nits above which you might want to trivially address.

These all looked reasonable to me, I'll address them in the next version.

> Otherwise this looks ok to me, feel free to add:
>
> Reviewed-by: Petr Pavlu 

Thanks!

Sami

Re: [PATCH 2/2] soc: qcom: pmic_glink: Handle GLINK intent allocation rejections

2024-10-23 Thread Bjorn Andersson

On Tue, Oct 22, 2024 at 05:30:55PM GMT, Johan Hovold wrote:
> On Tue, Oct 22, 2024 at 04:17:12AM +, Bjorn Andersson wrote:
[..]
> > Reported-by: Johan Hovold 
> > Closes: https://lore.kernel.org/all/zqet8iinndhnx...@hovoldconsulting.com/#t
> 
> This indeed seems to fix the -ECANCELED related errors I reported above,
> but the audio probe failure still remains as expected:
> 
>   PDR: avs/audio get domain list txn wait failed: -110
>   PDR: service lookup for avs/audio failed: -110
> 
> I hit it on the third reboot and then again after another 75 reboots
> (and have never seen it with the user space pd-mapper over several
> hundred boots).
> 
> Do you guys have any theories as to what is causing the above with the
> in-kernel pd-mapper (beyond the obvious changes in timing)?
> 

Not yet. This would be a timeout in a completely different codepath.

I'm trying to figure out a better way to reproduce this, than just
restarting the whole machine...

Thanks for the review.

Regards,
Bjorn

Re: [PATCH rcu] srcu: Guarantee non-negative return value from srcu_read_lock()

2024-10-23 Thread Andrii Nakryiko

On Wed, Oct 23, 2024 at 9:46 AM Alan Huang  wrote:
>
> On Oct 24, 2024, at 00:34, Andrii Nakryiko  wrote:
> >
> > On Tue, Oct 22, 2024 at 11:40 PM Christoph Hellwig  
> > wrote:
> >>
> >> On Tue, Oct 22, 2024 at 10:29:13AM -0700, Andrii Nakryiko wrote:
> 
>  Would this work?
> 
>  #define SRCU_INVALID_INDEX -1
> 
> >>>
> >>> But why?
> >>
> >> Becaue it very clearly documents what is going on.
> >
> > So does saying "returned value is going to be non-negative, always".
> > It's not some weird and unusual thing in C by any means.
> >
> >>
> >>> It's a nice property to have an int-returning API where valid
> >>> values are only >= 0, so callers are free to use the entire negative
> >>> range (not just -1) for whatever they need to store in case there is
> >>> no srcu_idx value.
> >>
> >> Well, if you have a concrete use case for that we can probably live
> >> with it, but I'd rather have that use case extremely well documented,
> >> as it will be very puzzling to the reader.
> >>
> >
> > See [0] for what Peter is proposing. Note hprobe_init() and its use of
> > `srcu_idx < 0` as a mark that there is no SRCU "session" associated
> > with the uprobe. Now, if we say that SRCU_INVALID_INDEX is the only
> > invalid value, it leaves a question: what to do with other negative
> > srcu_idx values? Are they valid? Should I now WARN() on "unknown
> > invalid" values? Why all these complications? I'd rather just not have
> > a unified hprobe_init() at that point, TBH.
> >
> > But anyways, taking a step back. Take idr_alloc()/idr_alloc_cyclic()
> > APIs as an example. They return int, but valid IDs are documented to
> > be positive. This leaves users of this API free to use int to store ID
> > in their data structures, knowing that <= 0 is "no ID", and if
> > necessary, they can have multiple possible "no ID" situations.
> >
> > Same principle here. Why prescribing a randomly picked -1 as the only
> > "valid" invalid value, if anything negative is clearly impossible.
> >
>
> A IS_INVALID_SRCU_INDEX macro would suffice, you need the condition anyway.

That condition is `if (srcu_idx < 0)`, no need for ugly macros that
obscure things unnecessarily.

>
> >
> >  [0] 
> > https://lore.kernel.org/linux-trace-kernel/20241018101647.ga36...@noisy.programming.kicks-ass.net/
>
>

[PATCH V5 1/1] livepatch: Add stack_order sysfs attribute

2024-10-23 Thread Wardenjohn

Add "stack_order" sysfs attribute which holds the order in which a live
patch module was loaded into the system. A user can then determine an
active live patched version of a function.

cat /sys/kernel/livepatch/livepatch_1/stack_order -> 1

means that livepatch_1 is the first live patch applied

cat /sys/kernel/livepatch/livepatch_module/stack_order -> N

means that livepatch_module is the Nth live patch applied

Suggested-by: Petr Mladek 
Suggested-by: Miroslav Benes 
Suggested-by: Josh Poimboeuf 
Signed-off-by: Wardenjohn 
---
 .../ABI/testing/sysfs-kernel-livepatch|  9 +++
 kernel/livepatch/core.c   | 24 +++
 2 files changed, 33 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-kernel-livepatch 
b/Documentation/ABI/testing/sysfs-kernel-livepatch
index 3735d868013d..73e40d02345e 100644
--- a/Documentation/ABI/testing/sysfs-kernel-livepatch
+++ b/Documentation/ABI/testing/sysfs-kernel-livepatch
@@ -55,6 +55,15 @@ Description:
An attribute which indicates whether the patch supports
atomic-replace.
 
+What:  /sys/kernel/livepatch//stack_order
+Date:  Oct 2024
+KernelVersion: 6.13.0
+Description:
+   This attribute specifies the sequence in which live patch 
modules
+   are applied to the system. If multiple live patches modify the 
same
+   function, the implementation with the biggest 'stack_order' 
number
+   is used, unless a transition is currently in progress.
+
 What:  /sys/kernel/livepatch//
 Date:  Nov 2014
 KernelVersion: 3.19.0
diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 3c21c31796db..0cd39954d5a1 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -347,6 +347,7 @@ int klp_apply_section_relocs(struct module *pmod, Elf_Shdr 
*sechdrs,
  * /sys/kernel/livepatch//transition
  * /sys/kernel/livepatch//force
  * /sys/kernel/livepatch//replace
+ * /sys/kernel/livepatch//stack_order
  * /sys/kernel/livepatch//
  * /sys/kernel/livepatch///patched
  * /sys/kernel/livepatch///
@@ -452,15 +453,38 @@ static ssize_t replace_show(struct kobject *kobj,
return sysfs_emit(buf, "%d\n", patch->replace);
 }
 
+static ssize_t stack_order_show(struct kobject *kobj,
+   struct kobj_attribute *attr, char *buf)
+{
+   struct klp_patch *patch, *this_patch;
+   int stack_order = 0;
+
+   this_patch = container_of(kobj, struct klp_patch, kobj);
+
+   mutex_lock(&klp_mutex);
+
+   klp_for_each_patch(patch) {
+   stack_order++;
+   if (patch == this_patch)
+   break;
+   }
+
+   mutex_unlock(&klp_mutex);
+
+   return sysfs_emit(buf, "%d\n", stack_order);
+}
+
 static struct kobj_attribute enabled_kobj_attr = __ATTR_RW(enabled);
 static struct kobj_attribute transition_kobj_attr = __ATTR_RO(transition);
 static struct kobj_attribute force_kobj_attr = __ATTR_WO(force);
 static struct kobj_attribute replace_kobj_attr = __ATTR_RO(replace);
+static struct kobj_attribute stack_order_kobj_attr = __ATTR_RO(stack_order);
 static struct attribute *klp_patch_attrs[] = {
&enabled_kobj_attr.attr,
&transition_kobj_attr.attr,
&force_kobj_attr.attr,
&replace_kobj_attr.attr,
+   &stack_order_kobj_attr.attr,
NULL
 };
 ATTRIBUTE_GROUPS(klp_patch);
-- 
2.18.2

[PATCH] sched_ext: Fix function pointer type mismatches in BPF selftests

2024-10-23 Thread Vishal Chourasia

Fix incompatible function pointer type warnings in sched_ext BPF selftests by
explicitly casting the function pointers when initializing struct_ops.
This addresses multiple -Wincompatible-function-pointer-types warnings from the
clang compiler where function signatures didn't match exactly.

The void * cast ensures the compiler accepts the function pointer
assignment despite minor type differences in the parameters.

Signed-off-by: Vishal Chourasia 
---
 .../selftests/sched_ext/create_dsq.bpf.c  |  6 +-
 .../sched_ext/ddsp_bogus_dsq_fail.bpf.c   |  4 +-
 .../sched_ext/ddsp_vtimelocal_fail.bpf.c  |  4 +-
 .../selftests/sched_ext/dsp_local_on.bpf.c|  8 +--
 .../sched_ext/enq_select_cpu_fails.bpf.c  |  4 +-
 tools/testing/selftests/sched_ext/exit.bpf.c  | 14 ++---
 .../testing/selftests/sched_ext/hotplug.bpf.c |  8 +--
 .../sched_ext/init_enable_count.bpf.c |  8 +--
 .../testing/selftests/sched_ext/maximal.bpf.c | 58 +--
 .../selftests/sched_ext/maybe_null.bpf.c  |  6 +-
 .../sched_ext/maybe_null_fail_dsp.bpf.c   |  4 +-
 .../sched_ext/maybe_null_fail_yld.bpf.c   |  4 +-
 .../selftests/sched_ext/prog_run.bpf.c|  2 +-
 .../selftests/sched_ext/select_cpu_dfl.bpf.c  |  2 +-
 .../sched_ext/select_cpu_dfl_nodispatch.bpf.c |  6 +-
 .../sched_ext/select_cpu_dispatch.bpf.c   |  2 +-
 .../select_cpu_dispatch_bad_dsq.bpf.c |  4 +-
 .../select_cpu_dispatch_dbl_dsp.bpf.c |  4 +-
 .../sched_ext/select_cpu_vtime.bpf.c  | 12 ++--
 19 files changed, 80 insertions(+), 80 deletions(-)

diff --git a/tools/testing/selftests/sched_ext/create_dsq.bpf.c 
b/tools/testing/selftests/sched_ext/create_dsq.bpf.c
index 23f79ed343f02..2cfc4ffd60e28 100644
--- a/tools/testing/selftests/sched_ext/create_dsq.bpf.c
+++ b/tools/testing/selftests/sched_ext/create_dsq.bpf.c
@@ -51,8 +51,8 @@ s32 BPF_STRUCT_OPS_SLEEPABLE(create_dsq_init)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops create_dsq_ops = {
-   .init_task  = create_dsq_init_task,
-   .exit_task  = create_dsq_exit_task,
-   .init   = create_dsq_init,
+   .init_task  = (void *) create_dsq_init_task,
+   .exit_task  = (void *) create_dsq_exit_task,
+   .init   = (void *) create_dsq_init,
.name   = "create_dsq",
 };
diff --git a/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c 
b/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c
index e97ad41d354ad..37d9bf6fb7458 100644
--- a/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c
+++ b/tools/testing/selftests/sched_ext/ddsp_bogus_dsq_fail.bpf.c
@@ -35,8 +35,8 @@ void BPF_STRUCT_OPS(ddsp_bogus_dsq_fail_exit, struct 
scx_exit_info *ei)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops ddsp_bogus_dsq_fail_ops = {
-   .select_cpu = ddsp_bogus_dsq_fail_select_cpu,
-   .exit   = ddsp_bogus_dsq_fail_exit,
+   .select_cpu = (void *) ddsp_bogus_dsq_fail_select_cpu,
+   .exit   = (void *) ddsp_bogus_dsq_fail_exit,
.name   = "ddsp_bogus_dsq_fail",
.timeout_ms = 1000U,
 };
diff --git a/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c 
b/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c
index dde7e7dafbfbc..dffc97d9cdf14 100644
--- a/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c
+++ b/tools/testing/selftests/sched_ext/ddsp_vtimelocal_fail.bpf.c
@@ -32,8 +32,8 @@ void BPF_STRUCT_OPS(ddsp_vtimelocal_fail_exit, struct 
scx_exit_info *ei)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops ddsp_vtimelocal_fail_ops = {
-   .select_cpu = ddsp_vtimelocal_fail_select_cpu,
-   .exit   = ddsp_vtimelocal_fail_exit,
+   .select_cpu = (void *) ddsp_vtimelocal_fail_select_cpu,
+   .exit   = (void *) ddsp_vtimelocal_fail_exit,
.name   = "ddsp_vtimelocal_fail",
.timeout_ms = 1000U,
 };
diff --git a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c 
b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
index efb4672decb41..6a7db1502c29e 100644
--- a/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
+++ b/tools/testing/selftests/sched_ext/dsp_local_on.bpf.c
@@ -56,10 +56,10 @@ void BPF_STRUCT_OPS(dsp_local_on_exit, struct scx_exit_info 
*ei)
 
 SEC(".struct_ops.link")
 struct sched_ext_ops dsp_local_on_ops = {
-   .select_cpu = dsp_local_on_select_cpu,
-   .enqueue= dsp_local_on_enqueue,
-   .dispatch   = dsp_local_on_dispatch,
-   .exit   = dsp_local_on_exit,
+   .select_cpu = (void *) dsp_local_on_select_cpu,
+   .enqueue= (void *) dsp_local_on_enqueue,
+   .dispatch   = (void *) dsp_local_on_dispatch,
+   .exit   =

Re: [PATCH v2 2/2] soc: qcom: pmic_glink: Handle GLINK intent allocation rejections

2024-10-23 Thread Johan Hovold

On Wed, Oct 23, 2024 at 05:24:33PM +, Bjorn Andersson wrote:
> Some versions of the pmic_glink firmware does not allow dynamic GLINK
> intent allocations, attempting to send a message before the firmware has
> allocated its receive buffers and announced these intent allocations
> will fail.

> Retry the send until intent buffers becomes available, or an actual
> error occur.

> Reported-by: Johan Hovold 
> Closes: https://lore.kernel.org/all/zqet8iinndhnx...@hovoldconsulting.com/#t
> Cc: sta...@vger.kernel.org # rpmsg: glink: Handle rejected intent request 
> better
> Fixes: 58ef4ece1e41 ("soc: qcom: pmic_glink: Introduce base PMIC GLINK 
> driver")
> Tested-by: Johan Hovold 
> Reviewed-by: Johan Hovold 
> Signed-off-by: Bjorn Andersson 

Thanks for the update. Still works as intended here.

>  int pmic_glink_send(struct pmic_glink_client *client, void *data, size_t len)
>  {
>   struct pmic_glink *pg = client->pg;
> + bool timeout_reached = false;
> + unsigned long start;
>   int ret;
>  
>   mutex_lock(&pg->state_lock);
> - if (!pg->ept)
> + if (!pg->ept) {
>   ret = -ECONNRESET;
> - else
> - ret = rpmsg_send(pg->ept, data, len);
> + } else {
> + start = jiffies;
> + for (;;) {
> + ret = rpmsg_send(pg->ept, data, len);
> + if (ret != -EAGAIN)
> + break;
> +
> + if (timeout_reached) {
> + ret = -ETIMEDOUT;
> + break;
> + }
> +
> + usleep_range(1000, 5000);

I ran some quick tests of this patch this morning (reproducing the issue
five times), and with the above delay it seems a single resend is
enough. Dropping the delay I once hit:

[8.723479] qcom_pmic_glink pmic-glink: pmic_glink_send - resend
[8.723877] qcom_pmic_glink pmic-glink: pmic_glink_send - resend
[8.723921] qcom_pmic_glink pmic-glink: pmic_glink_send - resend
[8.723951] qcom_pmic_glink pmic-glink: pmic_glink_send - resend
[8.723981] qcom_pmic_glink pmic-glink: pmic_glink_send - resend
[8.724010] qcom_pmic_glink pmic-glink: pmic_glink_send - resend
[8.724046] qcom_pmic_glink pmic-glink: pmic_glink_send - resend

which seems to suggest that a one millisecond sleep is sufficient for
the currently observed issue.

It would still mean up to 5k calls if you ever try to send a too large
buffer or similar and spin here for five seconds however. Perhaps
nothing to worry about at this point, but increasing the delay or
lowering the timeout could be considered.

> + timeout_reached = time_after(jiffies, start + 
> PMIC_GLINK_SEND_TIMEOUT);
> + }
> + }
>   mutex_unlock(&pg->state_lock);
>  
>   return ret;

Johan

93 matches

Mail list logo