On Fri, Apr 3, 2026 at 12:36 AM Mi, Dapeng <[email protected]> wrote:
>
>
> On 4/3/2026 1:32 AM, Falcon, Thomas wrote:
> > On Wed, 2026-04-01 at 13:40 -0700, Ian Rogers wrote:
> >> On Mon, Mar 23, 2026 at 3:40 AM Venkat <[email protected]> wrote:
> >>>
> >>>
> >>>> On 15 Mar 2026, at 4:27 PM, Athira Rajeev <[email protected]> wrote:
> >>>>
> >>>> Currently in "perf all PMU test", for "perf stat -e <event> true",
> >>>> below checks are done:
> >>>> - if return code is zero, look for "not supported" to decide pass scenario
> >>>> - check for "not supported" to ignore the event
> >>>> - looks for "No permission to enable" to skip the event.
> >>>> - If output has "Bad event name", fail the test.
> >>>> - Use "Access to performance monitoring and observability operations is
> >>>>   limited." to ignore fail due to access limitations
> >>>>
> >>>> If we failed to see event and it is supported, retries with longer
> >>>> workload "perf bench internals synthesize".
> >>>> - Here if output has <event>, the test is a pass.
> >>>>
> >>>> Snippet of code check:
> >>>> ```
> >>>> output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> >>>> if echo "$output" | grep -q "$p"
> >>>> ```
> >>>> - if output doesn't have event printed in logs, considers it fail.
> >>>>
> >>>> But this results in false pass for events in some cases.
> >>>> Example, if perf stat fails as below:
> >>>>
> >>>> # ./perf stat -e pmu/event/ true
> >>>> event syntax error: 'pmu/event/'
> >>>> \___ Bad event or PMU
> >>>>
> >>>> Unable to find PMU or event on a PMU of 'pmu'
> >>>> Run 'perf list' for a list of valid events
> >>>>
> >>>> Usage: perf stat [<options>] [<command>]
> >>>>
> >>>> -e, --event <event> event selector. use 'perf list' to list available events
> >>>> # echo $?
> >>>> 129
> >>>>
> >>>> Since this has non-zero return code and doesn't have the
> >>>> fail strings being checked in the test, it will enter check using
> >>>> longer workload. and since the output fail log has event, it
> >>>> declares test as "supported".
> >>>>
> >>>> Since all the fail strings can't be added in the check, update
> >>>> the testcase to check return code before proceeding to longer
> >>>> workload run.
> >>>>
> >>>> Another missing scenario is when system wide monitoring is supported
> >>>> example:
> >>>> # ./perf stat -e pmu/event/ true
> >>>> Error:
> >>>> No supported events found.
> >>>> Unsupported event (pmu/event/H) in per-thread mode, enable
> >>>> system wide with '-a'.
> >>>>
> >>>> Update testcase to check with "perf stat -a -e $p" as well
> >>>>
> >>>> Signed-off-by: Athira Rajeev <[email protected]>
> >>>> ---
> >>> Tested this patch.
> >>>
> >>>
> >>> With this patch:
> >>>
> >>> Testing hv_24x7/CPM_ADJUNCT_INST/ -- perf stat failed with non-zero return code
> >>> Testing hv_24x7/CPM_ADJUNCT_PCYC/ -- perf stat failed with non-zero return code
> >>>
> >>>
> >>>
> >>> Tested-by: Venkat Rao Bagalkote <[email protected]>
> >> Testing on an Intel Alderlake the test is now failing:
> >> ```
> >> ...
> >> Testing offcore_requests_outstanding.l3_miss_demand_data_rd -- supported
> >> Testing ocr.full_streaming_wr.any_response -- perf stat failed with non-zero return code
> >> Testing ocr.partial_streaming_wr.any_response -- perf stat failed with non-zero return code
> >> Testing ocr.streaming_wr.any_response -- supported
> >> ...
> >> ```
> >>
> >> Running `perf stat` manually reveals an issue with the event:
> >> ```
> >> $ sudo perf stat -vv -e ocr.full_streaming_wr.any_response -a sleep 1
> >> Using CPUID GenuineIntel-6-B7-1
> >> Attempt to add: cpu_atom/ocr.full_streaming_wr.any_response/
> >> ..after resolving event:
> >> cpu_atom/event=0xb7,period=0x186a3,umask=0x1,offcore_rsp=0x800000010000/
> >> ocr.full_streaming_wr.any_response -> cpu_atom/ocr.full_streaming_wr.any_response/
> >> Control descriptor is not initialized
> >> ------------------------------------------------------------
> >> perf_event_attr:
> >>   type 10 (cpu_atom)
> >>   size 144
> >> ------------------------------------------------------------
> >> perf_event_attr:
> >>   type 0 (PERF_TYPE_HARDWARE)
> >>   config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
> >>   disabled 1
> >> ------------------------------------------------------------
> >> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
> >> ------------------------------------------------------------
> >> perf_event_attr:
> >>   type 0 (PERF_TYPE_HARDWARE)
> >>   config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
> >>   disabled 1
> >> ------------------------------------------------------------
> >> sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
> >>   config 0x1b7 (ocr.demand_data_rd.l3_hit.snoop_hit_no_fwd)
> >>   sample_type IDENTIFIER
> >>   read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
> >>   disabled 1
> >>   inherit 1
> >>   { bp_addr, config1 } 0x800000010000
> >> ------------------------------------------------------------
> >> sys_perf_event_open: pid -1 cpu 16 group_fd -1 flags 0x8
> >> sys_perf_event_open failed, error -22
> >> switching off deferred callchain support
> >> Warning:
> >> ocr.full_streaming_wr.any_response event is not supported by the kernel.
> >> The sys_perf_event_open() syscall failed for event
> >> (ocr.full_streaming_wr.any_response): Invalid argument
> >> "dmesg | grep -i perf" may provide additional information.
> >>
> >> Error:
> >> No supported events found.
> >> The sys_perf_event_open() syscall failed for event
> >> (ocr.full_streaming_wr.any_response): Invalid argument
> >> "dmesg | grep -i perf" may provide additional information.
> >> ```
> >>
> >> This looks like a latent Intel cpu_atom PMU bug. Thomas, wdyt?
>
> Hmm, it looks like the error is caused by an invalid bitmask for the
> OFFCORE_RSP_x MSRs. Currently the valid bitmask for the OFFCORE_RSP_x MSRs
> is set to 0x3fffffffff in intel_grt_extra_regs[], while the MSR value is set
> to 0x800000010000 for the ocr.full_streaming_wr.any_response event. Bit 47
> is treated as an invalid bit, so event creation is aborted.
>
> Based on "Table 21-56. MSR_OFFCORE_RSPx Request Type Definition" in the
> SDM, bit 47 should be a valid bit now. Presumably bit 47 was not a valid bit
> when the ADL PMU support was added, but the definition was updated later and
> the bit became valid.
>
> With the constant updates to the perf event lists
> (https://github.com/intel/perfmon), we have noticed some mismatches
> between the driver's hardcoded events and the perfmon event lists.
> We are currently summarizing these mismatches. Once the list is
> finalized, we will submit a patchset to fix them.
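
The mask comparison described above can be reproduced with plain shell
arithmetic. This is only a sketch using the two constants quoted in this
thread; the variable names are made up for illustration, and it is not the
kernel's actual validation code.

```shell
#!/usr/bin/env bash
# Constants as quoted in the thread (not read from the kernel source here).
valid_mask=0x3fffffffff      # valid bits reported for intel_grt_extra_regs[]
offcore_rsp=0x800000010000   # value used by ocr.full_streaming_wr.any_response

# Bits that are set in the requested value but fall outside the valid mask.
invalid_bits=$(( offcore_rsp & ~valid_mask ))

printf 'bits outside the mask: 0x%x\n' "$invalid_bits"

# Bit 47 is the one discussed above.
if (( invalid_bits & (1 << 47) )); then
    echo "bit 47 is outside the valid mask"
fi
```

The mask covers bits 0-37, so bit 16 of the value is accepted and only
bit 47 is left over, which matches the EINVAL (-22) in the verbose output.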

That's great. If it takes too long, perhaps we could just remove the
events for now.

Thanks,
Ian

> Thanks.
>
> > +Dapeng, Zide, Andi
> >
> > Thanks,
> > Tom
> >
> >> Thanks,
> >> Ian
> >>
> >>> Regards,
> >>> Venkat.
> >>>
> >>>
> >>>
> >>>> tools/perf/tests/shell/stat_all_pmu.sh | 20 ++++++++++++++++++++
> >>>> 1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/tools/perf/tests/shell/stat_all_pmu.sh b/tools/perf/tests/shell/stat_all_pmu.sh
> >>>> index 9c466c0efa85..6c4d59cbfa5f 100755
> >>>> --- a/tools/perf/tests/shell/stat_all_pmu.sh
> >>>> +++ b/tools/perf/tests/shell/stat_all_pmu.sh
> >>>> @@ -53,6 +53,26 @@ do
> >>>>      continue
> >>>>    fi
> >>>>
> >>>> +  # check with system wide if it is supported.
> >>>> +  output=$(perf stat -a -e "$p" true 2>&1)
> >>>> +  stat_result=$?
> >>>> +  if echo "$output" | grep -q "not supported"
> >>>> +  then
> >>>> +    # Event not supported, so ignore.
> >>>> +    echo "not supported"
> >>>> +    continue
> >>>> +  fi
> >>>> +
> >>>> +  # checked through possible access limitations and permissions.
> >>>> +  # At this step, non-zero return code from "perf stat" needs to
> >>>> +  # reported as fail for the user to investigate
> >>>> +  if [ $stat_result -ne 0 ]
> >>>> +  then
> >>>> +    echo "perf stat failed with non-zero return code"
> >>>> +    err=1
> >>>> +    continue
> >>>> +  fi
> >>>> +
> >>>>    # We failed to see the event and it is supported. Possibly the workload was
> >>>>    # too small so retry with something longer.
> >>>>    output=$(perf stat -e "$p" perf bench internals synthesize 2>&1)
> >>>> --
> >>>> 2.47.3
> >>>>
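
For anyone following along, the ordering the patch relies on can be shown
without running perf at all. The sketch below is a simplified stand-in:
classify_result is a hypothetical helper, not a function in
stat_all_pmu.sh, and it leaves out the permission and access checks the
real script performs. It only shows why testing the return code before
grepping for the event name avoids the false pass from the commit message.

```shell
#!/usr/bin/env bash
# classify_result is hypothetical and exists only for illustration.
# $1: combined stdout/stderr of "perf stat"
# $2: exit status of "perf stat"
# $3: event name that was requested
classify_result() {
    local output=$1 status=$2 event=$3
    if grep -q "not supported" <<<"$output"; then
        echo "not supported"
    elif [ "$status" -ne 0 ]; then
        # A failure log may still echo the event name, so check status first.
        echo "failed"
    elif grep -qF -- "$event" <<<"$output"; then
        echo "supported"
    else
        echo "retry"
    fi
}

# The parse failure from the commit message: the name appears, status is 129.
classify_result "event syntax error: 'pmu/event/'" 129 "pmu/event/"
```

The last line prints "failed". With the name match done first, the same
input would have been reported as supported. The sketch uses grep -F so
that '.' and '/' in event names are matched literally; the real test uses
a plain grep -q "$p".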
