On Tue, Apr 15, 2025 at 09:14:36PM -0400, Joel Fernandes wrote:
>
>
> On 4/15/2025 5:15 PM, Paul E. McKenney wrote:
> > On Tue, Apr 15, 2025 at 10:59:36AM -0700, Paul E. McKenney wrote:
> >> On Tue, Apr 15, 2025 at 01:16:15PM -0400, Joel Fernandes wrote:
> >>>
> >>>
> >>> On 3/31/2025 5:03 PM, Paul E. McKenney wrote:
> >>>> This commit adds a new rcutorture.n_up_down kernel boot parameter
> >>>> that specifies the number of outstanding SRCU up/down readers, which
> >>>> begin in kthread context and end in an hrtimer handler. There is a new
> >>>> kthread ("rcu_torture_updown") that scans a per-reader array looking
> >>>> for elements whose readers have ended. This kthread sleeps between one
> >>>> and two milliseconds between consecutive scans.
> >>>>
> >>>> [ paulmck: Apply kernel test robot feedback. ]
> >>>> [ paulmck: Apply Z qiang feedback. ]
> >>>>
> >>>> Signed-off-by: Paul E. McKenney <[email protected]>
> >>>
> >>> For completeness, posting our discussion for the archives: an issue exists
> >>> in this patch, causing the following errors on an ARM64 machine with 288 CPUs:
> >>>
> >>> When running the SRCU-P test, we intermittently see:
> >>>
> >>> [ 9500.806108] ??? Writer stall state RTWS_SYNC(21) g18446744073709551218 f0x0 ->state 0x2 cpu 4
> >>> [ 9515.833356] ??? Writer stall state RTWS_SYNC(21) g18446744073709551218 f0x0 ->state 0x2 cpu 4
> >>>
> >>> It bisected to just this patch.
> >>
> >> Looks like your getting rcutorture running on ARM was well timed!
>
> Yes! Glad I could help.
>
> > And could you please send along your dmesg and .config files?
>
> Sure, attached both for one of the failed runs.

Thank you! That did answer at least one of my questions. It also showed
the need for the diff below. :-/

As in kvm.sh and friends might well be missing failures in your runs,
given that these scripts currently match only "Call Trace:" and not the
"Call trace:" spelling.

Thanx, Paul

------------------------------------------------------------------------
diff --git a/tools/testing/selftests/rcutorture/bin/console-badness.sh b/tools/testing/selftests/rcutorture/bin/console-badness.sh
index aad51e7c0183d..991fb11306eb6 100755
--- a/tools/testing/selftests/rcutorture/bin/console-badness.sh
+++ b/tools/testing/selftests/rcutorture/bin/console-badness.sh
@@ -10,7 +10,7 @@
#
# Authors: Paul E. McKenney <[email protected]>
-grep -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
+grep -E 'Badness|WARNING:|Warn|BUG|===========|BUG: KCSAN:|Call Trace:|Call trace:|Oops:|detected stalls on CPUs/tasks:|self-detected stall on CPU|Stall ended before state dump start|\?\?\? Writer stall state|rcu_.*kthread starved for|!!!' |
grep -v 'ODEBUG: ' |
grep -v 'This means that this is a DEBUG kernel and it is' |
grep -v 'Warning: unable to open an initial console' |
diff --git a/tools/testing/selftests/rcutorture/bin/parse-console.sh b/tools/testing/selftests/rcutorture/bin/parse-console.sh
index b07c11cf6929d..21e6ba3615f6a 100755
--- a/tools/testing/selftests/rcutorture/bin/parse-console.sh
+++ b/tools/testing/selftests/rcutorture/bin/parse-console.sh
@@ -148,7 +148,7 @@ then
summary="$summary KCSAN: $n_kcsan"
fi
fi
- n_calltrace=`grep -c 'Call Trace:' $file`
+ n_calltrace=`grep -Ec 'Call Trace:|Call trace:' $file`
if test "$n_calltrace" -ne 0
then
summary="$summary Call Traces: $n_calltrace"
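
For illustration only, here is a minimal sketch of why the pattern change
matters. It is not part of the patch. The three log lines and the temporary
file are made up for the example; only the two grep patterns are taken from
the parse-console.sh hunk above. Because grep is case-sensitive by default,
the old pattern counts just the "Call Trace:" spelling, while the new
alternation counts both spellings.

```shell
#!/bin/sh
# Hypothetical console log containing both spellings of the call-trace
# header, plus one unrelated line.  These lines are invented for the demo.
file=$(mktemp)
printf '%s\n' \
	'[    1.000000] Call Trace:' \
	'[    2.000000] Call trace:' \
	'[    3.000000] some unrelated console output' > "$file"

# Old pattern: case-sensitive match, so "Call trace:" is not counted.
n_old=$(grep -c 'Call Trace:' "$file")

# New pattern from the diff: extended regex matching either spelling.
n_new=$(grep -Ec 'Call Trace:|Call trace:' "$file")

echo "old=$n_old new=$n_new"
rm -f "$file"
```

With the old pattern, a log that contains only the "Call trace:" spelling
would yield a count of zero, so the "Call Traces:" entry would never be
added to the summary.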