On Wednesday, November 10, 2010 05:27:49 pm Brice Goglin wrote:
> On 10/11/2010 15:02, Jirka Hladky wrote:
> >>> 2) hwloc-bind --get --membind is not working for me (RHEL 6.0)
> >>> $ hwloc-bind --membind node:1 --mempolicy interleave -- hwloc-bind
> >>> --get --membind
> >>> hwloc_get_membind failed (errno 22 Invalid argument)
> >>
> >> You get the same error when running only "hwloc-bind --get --membind",
> >> right?
> >
> > Yes:
> > $ hwloc-bind --get --membind
> > hwloc_get_membind failed (errno 22 Invalid argument)
> >
> >> I am not sure about this one. Do you have NUMA support in your kernel?
> >> Is your machine NUMA? Can you send the gather-topology tarball ? (if we
> >> don't have it already :))
> >
> > Yes, it's a NUMA box with NUMA support in kernel.
>
> Unfortunately, I can't reproduce this. I tried with your tarball, with a
> Red Hat 5 machine, and with a similar Nehalem-based machine running Debian.
>
> Can you try to debug this? I'd like to know if EINVAL is returned by the
> kernel or by hwloc. You'd have to open src/topology-linux.c, go to the
> function hwloc_linux_get_thisthread_membind() and add some printf calls
> there to check where EINVAL comes from.
>
> thanks,
> Brice
Hi Brice,
I have added some printf and perror calls. EINVAL is coming from the
get_mempolicy call:
==============================================================
  /* compute max_os_index */
  complete_nodeset = hwloc_topology_get_complete_nodeset(topology);
  if (complete_nodeset) {
    max_os_index = hwloc_bitmap_last(complete_nodeset);
    printf("max_os_index %u\n", max_os_index);
    if (max_os_index == (unsigned) -1)
      max_os_index = 0;
  } else {
    max_os_index = 0;
  }
  printf("max_os_index %u\n", max_os_index);
  /* round up to the nearest multiple of BITS_PER_LONG */
  max_os_index = (max_os_index + HWLOC_BITS_PER_LONG) & ~(HWLOC_BITS_PER_LONG - 1);
  printf("max_os_index %u\n", max_os_index);
  linuxmask = malloc(max_os_index/HWLOC_BITS_PER_LONG * sizeof(long));
  if (!linuxmask) {
    errno = ENOMEM;
    goto out;
  }
  err = get_mempolicy(&linuxpolicy, linuxmask, max_os_index, 0, 0);
  if (err < 0) {
    perror("get_mempolicy");
    goto out_with_mask;
  }
==========================================================================
On a system with 2 NUMA nodes:
$ utils/hwloc-bind --get --membind
max_os_index 1
max_os_index 1
max_os_index 64
get_mempolicy: Invalid argument
hwloc_get_membind failed (errno 22 Invalid argument)
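
To double-check the arithmetic behind the three printed values, the rounding
step can be isolated. This is only a sketch: SKETCH_BITS_PER_LONG stands in for
HWLOC_BITS_PER_LONG (assumed here to be the bit width of unsigned long), and
round_up_to_long_bits() and mask_longs() are names made up for illustration,
not hwloc functions.

```c
#include <limits.h>
#include <stddef.h>

/* Assumption: stand-in for HWLOC_BITS_PER_LONG, not taken from hwloc headers. */
#define SKETCH_BITS_PER_LONG ((unsigned) (sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical helper: same expression as in the excerpt above.
 * Returns the smallest multiple of SKETCH_BITS_PER_LONG that is
 * strictly greater than max_os_index. */
unsigned round_up_to_long_bits(unsigned max_os_index)
{
    return (max_os_index + SKETCH_BITS_PER_LONG) & ~(SKETCH_BITS_PER_LONG - 1);
}

/* Hypothetical helper: number of longs the mask buffer would hold. */
size_t mask_longs(unsigned max_os_index)
{
    return round_up_to_long_bits(max_os_index) / SKETCH_BITS_PER_LONG;
}
```

With a 64-bit long, round_up_to_long_bits(1) gives 64 and mask_longs(1) gives
1, which matches the output above, so the value passed to get_mempolicy is what
the code intends.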
I do not see any problem with your code. I don't know what's going on. Is
get_mempolicy itself buggy? How can I debug this?
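
One possible way to narrow it down: the get_mempolicy(2) man page lists, among
other EINVAL causes, a maxnode that is smaller than the number of node IDs the
kernel supports. Whether that applies here is only a guess. The sketch below
retries with a doubled maxnode until the call stops returning EINVAL. All
function names are made up for illustration, the call goes through a callback
so the loop can be tried without a NUMA kernel, and the 512 threshold in the
fake callback is an arbitrary example value, not something measured.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#if defined(__linux__)
#include <sys/syscall.h>
#include <unistd.h>
#endif

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Callback with the first three arguments of get_mempolicy(2). */
typedef long (*mempolicy_query_fn)(int *policy, unsigned long *mask,
                                   unsigned long maxnode);

/* Hypothetical probe: returns the first maxnode (doubling from one long)
 * that the query accepts, or 0 if none up to limit works or if the
 * failure is something other than EINVAL. */
unsigned long probe_min_maxnode(mempolicy_query_fn query, unsigned long limit)
{
    for (unsigned long maxnode = SKETCH_BITS_PER_LONG; maxnode <= limit; maxnode *= 2) {
        unsigned long *mask = calloc(maxnode / SKETCH_BITS_PER_LONG, sizeof(*mask));
        int policy = 0;
        if (!mask)
            return 0;
        long err = query(&policy, mask, maxnode);
        int saved_errno = errno;   /* free() may clobber errno */
        free(mask);
        if (err == 0)
            return maxnode;
        if (saved_errno != EINVAL)
            return 0;
    }
    return 0;
}

/* Fake kernel for trying the loop: rejects maxnode < 512 (arbitrary value). */
long fake_query_needs_512(int *policy, unsigned long *mask, unsigned long maxnode)
{
    if (maxnode < 512) {
        errno = EINVAL;
        return -1;
    }
    *policy = 0;
    mask[0] = 0;
    return 0;
}

#if defined(__linux__) && defined(SYS_get_mempolicy)
/* Real query: raw syscall with addr = 0 and flags = 0, as in the excerpt. */
long linux_query(int *policy, unsigned long *mask, unsigned long maxnode)
{
    return syscall(SYS_get_mempolicy, policy, mask, maxnode, 0UL, 0UL);
}
#endif
```

Calling probe_min_maxnode(linux_query, 65536) on the affected box and printing
the result would show whether a larger maxnode makes the EINVAL go away. A
result of 0 would point to a different cause.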
Thanks!
Jirka