On 12/11/2017 00:14, Biddiscombe, John A. wrote:
> I'm allocating some large matrices, from 10k squared elements up to
> 40k squared per node.
> I'm also using membind to place pages of the matrix memory across NUMA
> nodes, so that the matrix might be bound according to the kind of
> pattern at the end of this email - where each 1 or 0 corresponds to a
> 256x256 block of memory.
>
> The way I'm doing this is by calling hwloc_set_area_membind_nodeset
> many thousands of times after allocation, and I've found that as the
> matrices get bigger, after some N calls to area_membind I get a
> failure: it returns -1 (errno does not seem to be set to either ENOSYS
> or EXDEV), but strerror reports "Cannot allocate memory".
>
> Question 1: by calling set_area_membind too many times, am I causing
> some resource usage in the memory tables that is being exhausted?
>
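(For illustration, the per-tile binding described in the question might look
roughly like the sketch below. It assumes the hwloc 1.x API used in the post,
a matrix stored tile by tile with each 256x256 tile contiguous in memory,
double elements, and an owner() callback encoding the 0/1 pattern; the names
and layout are assumptions, not code from the original program, and error
handling is minimal.)

/* Rough sketch only: tile-contiguous layout and names are assumptions. */
#include <hwloc.h>

#define TILE 256  /* elements per tile side, as in the post */

static int bind_tiles(hwloc_topology_t topo, double *matrix, size_t ntiles,
                      unsigned (*owner)(size_t tile))
{
    const size_t tile_bytes = (size_t)TILE * TILE * sizeof(double);
    hwloc_nodeset_t nodeset = hwloc_bitmap_alloc();

    for (size_t t = 0; t < ntiles; t++) {
        /* Node 0 or 1 for this tile, following the 0/1 pattern. */
        hwloc_bitmap_only(nodeset, owner(t));
        /* One call per tile: a 40k x 40k matrix has >20k tiles, and each
         * call whose target node differs from its neighbours' costs the
         * kernel an extra VMA. */
        if (hwloc_set_area_membind_nodeset(topo,
                (char *)matrix + t * tile_bytes, tile_bytes,
                nodeset, HWLOC_MEMBIND_BIND, 0) < 0) {
            hwloc_bitmap_free(nodeset);
            return -1;  /* the -1 / "Cannot allocate memory" failure */
        }
    }
    hwloc_bitmap_free(nodeset);
    return 0;
}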
Hello,

That's likely what's happening. Each set_area() call may create a new
"virtual memory area" (VMA). The kernel tries to merge it with its
neighbours if they are bound to the same NUMA node; otherwise it creates
a new VMA. I can't find the exact limit, but it's something like 64k, so
I guess you're exhausting that.

> Question 2: Is there a better way of achieving the result I'm looking
> for (such as a call to membind with a stride of some kind, to say put N
> pages in a row on each domain in alternation)?

Unfortunately, the interleave policy doesn't have a stride argument: it's
one page on node 0, one page on node 1, etc. The only idea I have is to
use the first-touch policy: make sure your buffer isn't in physical
memory yet, and have a thread on node 0 read the "0" pages and another
thread on node 1 read the "1" pages.

Brice

>
> Many thanks
>
> JB
>
>
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 0000000000000000111111111111111100000000
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> 1111111111111111000000000000000011111111
> ... etc
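(A rough sketch of the first-touch idea suggested above, for illustration
only: it assumes two NUMA nodes, a tile-contiguous buffer that has not been
touched since allocation, hwloc 1.11 or later for HWLOC_OBJ_NUMANODE, and
made-up names such as touch_arg and owner. One of these threads would be
spawned per node and joined before the matrix is handed to the compute code.)

#include <hwloc.h>
#include <pthread.h>
#include <unistd.h>

struct touch_arg {
    hwloc_topology_t topo;
    unsigned node;              /* logical index of the NUMA node to fill */
    char *base;                 /* start of the untouched matrix buffer */
    size_t tile_bytes;          /* bytes per 256x256 tile */
    size_t ntiles;              /* total number of tiles */
    unsigned (*owner)(size_t);  /* the 0/1 pattern: which node owns a tile */
};

static void *touch_tiles(void *p)
{
    struct touch_arg *a = p;

    /* Run this thread on the cores of its NUMA node so that first-touch
     * places the pages it touches on that node. */
    hwloc_obj_t node = hwloc_get_obj_by_type(a->topo, HWLOC_OBJ_NUMANODE,
                                             a->node);
    if (node)
        hwloc_set_cpubind(a->topo, node->cpuset, HWLOC_CPUBIND_THREAD);

    long pagesize = sysconf(_SC_PAGESIZE);
    for (size_t t = 0; t < a->ntiles; t++) {
        if (a->owner(t) != a->node)
            continue;
        char *tile = a->base + t * a->tile_bytes;
        /* Touch one byte per page; writing (rather than only reading)
         * makes sure the page is actually committed on this node. */
        for (size_t off = 0; off < a->tile_bytes; off += (size_t)pagesize)
            tile[off] = 0;
    }
    return NULL;
}

Since placement happens at the first touch, the buffer must not be written
(e.g. zero-initialized elsewhere) before these threads run; in exchange, no
set_area_membind calls are needed, so the VMA limit is not hit.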