Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-12 Thread Athira Rajeev



> On 09-Apr-2022, at 10:48 PM, Arnaldo Carvalho de Melo  wrote:
> 
> On Sat, Apr 09, 2022 at 12:28:01PM -0300, Arnaldo Carvalho de Melo wrote:
>> On Wed, Apr 06, 2022 at 11:21:09PM +0530, Athira Rajeev wrote:
>>> The perf benchmark collections numa, futex and epoll
>>> fail on system configurations with more than 1024 CPUs.
>>> These benchmarks use "sched_getaffinity" and "sched_setaffinity"
>>> in the code to work with affinity.
>> 
>> Applied 1-3, 4 needs some reworking and can wait for v5.19, the first 3
>> are fixes, so can go now.
> 
> Now trying to fix this:
> 
>   26 7.89 debian:9  : FAIL gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)
> bench/numa.c: In function 'alloc_data':
> bench/numa.c:359:6: error: 'orig_mask' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>   ret = sched_setaffinity(0, size, mask);
>   ^~
> bench/numa.c:409:13: note: 'orig_mask' was declared here
>   cpu_set_t *orig_mask;
>  ^
> cc1: all warnings being treated as errors
> /git/perf-5.18.0-rc1/tools/build/Makefile.build:139: recipe for target 'bench' failed
> make[3]: *** [bench] Error 2
> 
> 
> Happened in several distros.

Hi Arnaldo

Thanks for pointing this out. I was able to reproduce the compile error on
Debian.
The cause is the variable orig_mask, which is initialised and used in the
"alloc_data" function only inside the if condition for "init_cpu0". Because
it is accessed conditionally, we can fix this by initialising it to NULL.
I also made some changes to CPU_FREE the mask in the other error paths.
I will post a V3 with these changes.
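
A minimal sketch of the pattern described above, not the actual V3 change:
a mask pointer that is only allocated inside a condition starts out as NULL,
and every exit path goes through a single CPU_FREE. The function name
run_maybe_pinned and its parameters are hypothetical; only the CPU_* macros
and the sched_*affinity calls are real glibc interfaces.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the alloc_data() pattern: orig_mask is only
 * allocated when pin_cpu0 is true, so it must start out as NULL.
 * Returns 0 on success, -1 on any allocation or affinity failure.
 */
int run_maybe_pinned(int nr_cpus, bool pin_cpu0)
{
	cpu_set_t *orig_mask = NULL;	/* avoids -Wmaybe-uninitialized */
	cpu_set_t *mask = NULL;
	size_t size;
	int ret = 0;

	assert(nr_cpus > 0);
	size = CPU_ALLOC_SIZE(nr_cpus);

	if (pin_cpu0) {
		orig_mask = CPU_ALLOC(nr_cpus);
		mask = CPU_ALLOC(nr_cpus);
		if (!orig_mask || !mask) {
			ret = -1;
			goto out;
		}
		/* Save the current affinity so it can be restored later. */
		CPU_ZERO_S(size, orig_mask);
		if (sched_getaffinity(0, size, orig_mask)) {
			ret = -1;
			goto out;
		}
		/* Bind the calling thread to CPU 0 only. */
		CPU_ZERO_S(size, mask);
		CPU_SET_S(0, size, mask);
		if (sched_setaffinity(0, size, mask)) {
			ret = -1;
			goto out;
		}
	}

	/* ... work that should run on CPU 0 when pinned ... */

	/* Restore the saved affinity if we changed it. */
	if (pin_cpu0 && sched_setaffinity(0, size, orig_mask))
		ret = -1;
out:
	/* CPU_FREE(NULL) is harmless, so one cleanup path covers all cases. */
	CPU_FREE(orig_mask);
	CPU_FREE(mask);
	return ret;
}
```

Because both pointers are NULL until allocated, the error paths do not need
to track which allocation succeeded.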

Athira

> 
> - Arnaldo



Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-09 Thread Arnaldo Carvalho de Melo
On Sat, Apr 09, 2022 at 12:28:01PM -0300, Arnaldo Carvalho de Melo wrote:
> On Wed, Apr 06, 2022 at 11:21:09PM +0530, Athira Rajeev wrote:
> > The perf benchmark collections numa, futex and epoll
> > fail on system configurations with more than 1024 CPUs.
> > These benchmarks use "sched_getaffinity" and "sched_setaffinity"
> > in the code to work with affinity.
> 
> Applied 1-3, 4 needs some reworking and can wait for v5.19, the first 3
> are fixes, so can go now.

Now trying to fix this:

  26 7.89 debian:9  : FAIL gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)
bench/numa.c: In function 'alloc_data':
bench/numa.c:359:6: error: 'orig_mask' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  ret = sched_setaffinity(0, size, mask);
  ^~
bench/numa.c:409:13: note: 'orig_mask' was declared here
  cpu_set_t *orig_mask;
 ^
cc1: all warnings being treated as errors
/git/perf-5.18.0-rc1/tools/build/Makefile.build:139: recipe for target 'bench' failed
make[3]: *** [bench] Error 2


Happened in several distros.

- Arnaldo


Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-09 Thread Arnaldo Carvalho de Melo
On Wed, Apr 06, 2022 at 11:21:09PM +0530, Athira Rajeev wrote:
> The perf benchmark collections numa, futex and epoll
> fail on system configurations with more than 1024 CPUs.
> These benchmarks use "sched_getaffinity" and "sched_setaffinity"
> in the code to work with affinity.

Applied 1-3, 4 needs some reworking and can wait for v5.19, the first 3
are fixes, so can go now.

- Arnaldo


Re: [PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-06 Thread Athira Rajeev



> On 07-Apr-2022, at 6:05 AM, Ian Rogers  wrote:
> 
> On Wed, Apr 6, 2022 at 10:51 AM Athira Rajeev wrote:
>> 
>> The perf benchmark collections numa, futex and epoll
>> fail on system configurations with more than 1024 CPUs.
>> These benchmarks use "sched_getaffinity" and "sched_setaffinity"
>> in the code to work with affinity.
>> 
>> Example snippet from numa benchmark:
>> <<>>
>> perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
>> Aborted (core dumped)
>> <<>>
>> 
>> The bind_to_node function uses "sched_getaffinity" to save the cpumask.
>> This fails with EINVAL because the default cpu_set_t in glibc holds only
>> 1024 CPUs.
>> 
>> Similarly, the futex and epoll benchmarks use sched_setaffinity during
>> pthread_create with affinity. Since it returns EINVAL in such a system
>> configuration, the benchmarks don't run.
>> 
>> To overcome this 1024-CPU mask size limitation of cpu_set_t,
>> change the mask size using the CPU_*_S macros, i.e. use CPU_ALLOC to
>> allocate the cpumask, CPU_ALLOC_SIZE for its size and CPU_SET_S to set
>> a mask bit.
>> 
>> Fix all the relevant places in the code to use a mask size which is large
>> enough to represent the number of possible CPUs in the system.
>> 
>> Fix the parse_setup_cpu_list function in numa bench to check that the
>> input CPU is online before binding a task to that CPU. This fixes failures
>> where the CPU number is within the max CPU but the CPU is offline.
>> In that case, sched_setaffinity fails when using a cpumask that has
>> that CPU's bit set.
>> 
>> Patch 1 and Patch 2 fix the perf bench futex and perf bench
>> epoll benchmarks. Patch 3 and Patch 4 fix the perf bench numa
>> benchmark.
>> 
>> Athira Rajeev (4):
>>  tools/perf: Fix perf bench futex to correct usage of affinity for
>>machines with #CPUs > 1K
>>  tools/perf: Fix perf bench epoll to correct usage of affinity for
>>machines with #CPUs > 1K
>>  tools/perf: Fix perf numa bench to fix usage of affinity for machines
>>with #CPUs > 1K
>>  tools/perf: Fix perf bench numa testcase to check if CPU used to bind
>>task is online
>> 
>> Changelog:
>> From v1 -> v2:
>> Addressed review comment from Ian Rogers to do
>> CPU_FREE in a cleaner way.
>> Added Tested-by from Disha Goel
> 
> 
> The whole set:
> Acked-by: Ian Rogers 

Thanks for checking, Ian.

Athira.
> 
> Thanks,
> Ian
> 
>> tools/perf/bench/epoll-ctl.c           |  25 --
>> tools/perf/bench/epoll-wait.c          |  25 --
>> tools/perf/bench/futex-hash.c          |  26 --
>> tools/perf/bench/futex-lock-pi.c       |  21 +++--
>> tools/perf/bench/futex-requeue.c       |  21 +++--
>> tools/perf/bench/futex-wake-parallel.c |  21 +++--
>> tools/perf/bench/futex-wake.c          |  22 --
>> tools/perf/bench/numa.c                | 105 ++---
>> tools/perf/util/header.c               |  43 ++
>> tools/perf/util/header.h               |   1 +
>> 10 files changed, 242 insertions(+), 68 deletions(-)
>> 
>> --
>> 2.35.1



[PATCH v2 0/4] Fix perf bench numa, futex and epoll to work with machines having #CPUs > 1K

2022-04-06 Thread Athira Rajeev
The perf benchmark collections numa, futex and epoll
fail on system configurations with more than 1024 CPUs.
These benchmarks use "sched_getaffinity" and "sched_setaffinity"
in the code to work with affinity.

Example snippet from numa benchmark:
<<>>
perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
Aborted (core dumped)
<<>>

The bind_to_node function uses "sched_getaffinity" to save the cpumask.
This fails with EINVAL because the default cpu_set_t in glibc holds only
1024 CPUs.
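
To make the limit concrete, here is a small sketch that only inspects the
fixed-size type. glibc's cpu_set_t is 128 bytes, i.e. 1024 bits, and
sched_getaffinity(2) is documented to return EINVAL when the buffer passed
in is smaller than the affinity mask the kernel uses. The helper names
fixed_mask_bits and needs_dynamic_mask are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Number of CPUs a fixed-size cpu_set_t can describe (one bit per CPU). */
size_t fixed_mask_bits(void)
{
	return sizeof(cpu_set_t) * 8;
}

/* Returns 1 when a fixed-size cpu_set_t is too small for nr_cpus CPUs. */
int needs_dynamic_mask(int nr_cpus)
{
	assert(nr_cpus > 0);
	return (size_t)nr_cpus > fixed_mask_bits();
}
```

On a machine with, say, 2048 possible CPUs, needs_dynamic_mask() reports
that sizeof(cpu_set_t) is not a safe size argument for the affinity calls.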

Similarly, the futex and epoll benchmarks use sched_setaffinity during
pthread_create with affinity. Since it returns EINVAL in such a system
configuration, the benchmarks don't run.
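
As a rough illustration of creating a thread with affinity on such a
system (a sketch, not the code from the futex or epoll patches), the mask
handed to pthread_attr_setaffinity_np can be allocated for the real CPU
count. The helpers alloc_single_cpu_mask and spawn_pinned, the worker
function and the nr_cpus parameter are assumptions made for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Trivial thread body used only for the example. */
static void *worker(void *arg)
{
	return arg;
}

/*
 * Allocate a mask sized for nr_cpus CPUs with only 'cpu' set.
 * Stores the mask size in bytes in *size; caller frees with CPU_FREE.
 */
cpu_set_t *alloc_single_cpu_mask(int cpu, int nr_cpus, size_t *size)
{
	cpu_set_t *mask;

	assert(cpu >= 0 && cpu < nr_cpus);
	mask = CPU_ALLOC(nr_cpus);
	if (!mask)
		return NULL;
	*size = CPU_ALLOC_SIZE(nr_cpus);
	CPU_ZERO_S(*size, mask);
	CPU_SET_S(cpu, *size, mask);
	return mask;
}

/* Start worker() pinned to 'cpu'. Returns 0 or an errno-style code. */
int spawn_pinned(pthread_t *thread, int cpu, int nr_cpus)
{
	pthread_attr_t attr;
	cpu_set_t *mask;
	size_t size;
	int ret;

	mask = alloc_single_cpu_mask(cpu, nr_cpus, &size);
	if (!mask)
		return -1;

	ret = pthread_attr_init(&attr);
	if (ret) {
		CPU_FREE(mask);
		return ret;
	}
	/* Pass the dynamic size, not sizeof(cpu_set_t). */
	ret = pthread_attr_setaffinity_np(&attr, size, mask);
	if (!ret)
		ret = pthread_create(thread, &attr, worker, NULL);

	pthread_attr_destroy(&attr);
	CPU_FREE(mask);
	return ret;
}
```

The important detail is that the same size value is used for allocation,
for CPU_SET_S and for the pthread call.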

To overcome this 1024-CPU mask size limitation of cpu_set_t,
change the mask size using the CPU_*_S macros, i.e. use CPU_ALLOC to
allocate the cpumask, CPU_ALLOC_SIZE for its size and CPU_SET_S to set
a mask bit.

Fix all the relevant places in the code to use a mask size which is large
enough to represent the number of possible CPUs in the system.
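
The macro usage named above can be sketched as follows. This is an
illustration under assumptions, not the patch itself: save_affinity is a
hypothetical helper, and nr_cpus is assumed to be the number of possible
CPUs, however the caller obtains it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Save the calling thread's affinity into a mask sized for nr_cpus CPUs.
 * On success returns the mask (free it with CPU_FREE) and stores its size
 * in bytes in *size_out; returns NULL on allocation or syscall failure.
 */
cpu_set_t *save_affinity(int nr_cpus, size_t *size_out)
{
	size_t size;
	cpu_set_t *mask;

	assert(nr_cpus > 0 && size_out != NULL);
	size = CPU_ALLOC_SIZE(nr_cpus);
	mask = CPU_ALLOC(nr_cpus);
	if (!mask)
		return NULL;

	CPU_ZERO_S(size, mask);
	/* The size argument must match the allocation, not sizeof(cpu_set_t). */
	if (sched_getaffinity(0, size, mask)) {
		CPU_FREE(mask);
		return NULL;
	}
	*size_out = size;
	return mask;
}
```

A later sched_setaffinity(0, size, mask) with the same size restores the
saved affinity.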

Fix the parse_setup_cpu_list function in numa bench to check that the
input CPU is online before binding a task to that CPU. This fixes failures
where the CPU number is within the max CPU but the CPU is offline.
In that case, sched_setaffinity fails when using a cpumask that has
that CPU's bit set.
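
One possible way to do such an online check, shown only as a sketch and
not as the helper added by this series: read the kernel's CPU list from
/sys/devices/system/cpu/online (a string such as "0-3,8-11") and test
whether the CPU falls in one of the ranges. The function names cpu_in_list
and cpu_is_online_sketch are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Return 1 if cpu is covered by a kernel-style CPU list such as
 * "0-3,8-11", 0 if it is not, and -1 if the list cannot be parsed.
 */
int cpu_in_list(const char *list, int cpu)
{
	const char *p = list;

	assert(list != NULL);
	while (*p && *p != '\n') {
		char *end;
		long lo = strtol(p, &end, 10);
		long hi = lo;

		if (end == p)
			return -1;	/* no digits where a number was expected */
		if (*end == '-') {
			p = end + 1;
			hi = strtol(p, &end, 10);
			if (end == p)
				return -1;
		}
		if (cpu >= lo && cpu <= hi)
			return 1;
		p = end;
		if (*p == ',')
			p++;
		else if (*p && *p != '\n')
			return -1;	/* unexpected separator */
	}
	return 0;
}

/* Return 1 if cpu is online, 0 if offline, -1 if sysfs cannot be read. */
int cpu_is_online_sketch(int cpu)
{
	char buf[4096];
	int ret = -1;
	FILE *f = fopen("/sys/devices/system/cpu/online", "r");

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		ret = cpu_in_list(buf, cpu);
	fclose(f);
	return ret;
}
```

A caller parsing a user-supplied CPU list could skip or reject any CPU for
which cpu_is_online_sketch() does not return 1 before setting its bit in
the mask.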

Patch 1 and Patch 2 fix the perf bench futex and perf bench
epoll benchmarks. Patch 3 and Patch 4 fix the perf bench numa
benchmark.

Athira Rajeev (4):
  tools/perf: Fix perf bench futex to correct usage of affinity for
machines with #CPUs > 1K
  tools/perf: Fix perf bench epoll to correct usage of affinity for
machines with #CPUs > 1K
  tools/perf: Fix perf numa bench to fix usage of affinity for machines
with #CPUs > 1K
  tools/perf: Fix perf bench numa testcase to check if CPU used to bind
task is online

Changelog:
From v1 -> v2:
 Addressed review comment from Ian Rogers to do
 CPU_FREE in a cleaner way.
 Added Tested-by from Disha Goel

 tools/perf/bench/epoll-ctl.c           |  25 --
 tools/perf/bench/epoll-wait.c          |  25 --
 tools/perf/bench/futex-hash.c          |  26 --
 tools/perf/bench/futex-lock-pi.c       |  21 +++--
 tools/perf/bench/futex-requeue.c       |  21 +++--
 tools/perf/bench/futex-wake-parallel.c |  21 +++--
 tools/perf/bench/futex-wake.c          |  22 --
 tools/perf/bench/numa.c                | 105 ++---
 tools/perf/util/header.c               |  43 ++
 tools/perf/util/header.h               |   1 +
 10 files changed, 242 insertions(+), 68 deletions(-)

-- 
2.35.1