Hello, sorry for the delayed answer.

> If I understand correctly, your allocations are going to swap as soon as
> you use more than half of your available local RAM?

Not directly. The amount of local RAM I can use is tied to the number of processor dies I use for an application: if I use half of all available processor dies, I can use half of my local RAM before my allocations go to swap; if I use a quarter of the dies, only a quarter of my RAM; and so on.
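For reference, this is roughly how one can check with hwloc which NUMA nodes are in the current memory binding and how much memory is local to each of them (a minimal sketch assuming hwloc 2.x; error checking omitted):

#include <stdio.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* Current memory binding of the process, expressed as a nodeset. */
    hwloc_bitmap_t nodeset = hwloc_bitmap_alloc();
    hwloc_membind_policy_t policy;
    hwloc_get_membind(topology, nodeset, &policy,
                      HWLOC_MEMBIND_PROCESS | HWLOC_MEMBIND_BYNODESET);

    /* Local memory of every NUMA node, and whether it is in the binding. */
    int n = hwloc_get_nbobjs_by_type(topology, HWLOC_OBJ_NUMANODE);
    for (int i = 0; i < n; i++) {
        hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, i);
        printf("NUMA node %u: %llu bytes local, %s\n",
               node->os_index,
               (unsigned long long) node->attr->numanode.local_memory,
               hwloc_bitmap_isset(nodeset, node->os_index)
                   ? "in current binding" : "not in current binding");
    }

    hwloc_bitmap_free(nodeset);
    hwloc_topology_destroy(topology);
    return 0;
}

On the topology below (lstopo output in the quoted message), this should report two NUMA nodes, one per package.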
I do not use SLURM to start my applications. I looked into cgroups, but I do not have a solid understanding of them yet. I will ask the system admin if he knows of a possible connection. Thanks for the reply.

Mike

On Mon, Oct 3, 2022 at 15:08, François Galea <fga...@free.fr> wrote:
>
> 26 Sept. 2022 09:36:51 Brice Goglin <brice.gog...@inria.fr>:
>
> Hello
>
> If I understand correctly, your allocations are going to swap as soon as
> you use more than half of your available local RAM? I don't see any reason
> for this unless some additional limit is preventing you from using more. Is
> there any chance your job was allocated with less memory (for instance
> SLURM uses Linux cgroups)?
>
> IIRC, cat /proc/self/cgroup will tell you if you are in a "memory" cgroup
> and its path. Then you can go in something like
> /sys/fs/cgroup/memory/<path> and dig in files there.
>
> For the record, hwloc takes care of ignoring NUMA nodes that are not
> available due to cgroups. However, cgroup memory limits are more difficult
> because they don't apply to specific nodes but rather to the entire
> available memory in the machine, hence it's not easy to expose them in
> hwloc.
>
> Brice
>
>
> On 25/09/2022 at 12:40, Mike wrote:
>
> Dear list,
>
> I work on two AMD EPYC 7713 64-core processors. There are 8 dies per
> processor, each containing 8 cores. The machine has 512 GB of main memory
> available and 112 GB of swap. The operating system is Debian.
>
> I use hwloc_set_area_membind() to optimize memory access. Now I came
> across behavior I cannot explain. When I do not use memory binding, this
> behavior does not occur. I can only use part of the main memory, depending
> on how many dies I use. An example:
>
> I use 8 processes for a calculation. I distribute them so that I fill one
> die of 8 cores completely. When I then allocate more than 32 GB (1/16 of
> main memory), the first 32 GB are stored in main memory but the remaining
> memory goes to swap.
>
> When I use 8 dies of 8 cores each, the first 256 GB (1/2 of main memory)
> are written into main memory before the rest goes to swap, and so on.
>
> Is there a way to circumvent this behavior? Thanks in advance.
>
> PS: This is the output of lstopo -.synthetic:
> Package:2 [NUMANode(memory=270369247232)] L3Cache:8(size=33554432)
> L2Cache:8(size=524288) L1dCache:1(size=32768) L1iCache:1(size=32768) Core:1
> PU:2(indexes=2*128:1*2)
>
> Mike
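In case the issue is in how the binding is done: below is a minimal sketch of the pattern I use with hwloc_set_area_membind(), simplified for illustration (assuming hwloc 2.x; the node index and the STRICT flag are placeholders, not necessarily what my real code does).

#include <stdlib.h>
#include <hwloc.h>

/* Allocate a buffer and bind its pages to the local memory of one NUMA
 * node. node_index is a placeholder; in practice it would be the node
 * that holds the die(s) the computation runs on. */
static void *alloc_on_node(hwloc_topology_t topology, size_t len, unsigned node_index)
{
    void *buf = malloc(len);
    if (!buf)
        return NULL;

    hwloc_obj_t node = hwloc_get_obj_by_type(topology, HWLOC_OBJ_NUMANODE, node_index);
    if (node) {
        /* Restrict the address range to this node's memory. With
         * HWLOC_MEMBIND_STRICT the pages must stay on that node (and may
         * swap once it is full); without STRICT, hwloc may apply a weaker
         * "preferred" policy that can fall back to other nodes. */
        hwloc_set_area_membind(topology, buf, len, node->nodeset,
                               HWLOC_MEMBIND_BIND,
                               HWLOC_MEMBIND_STRICT | HWLOC_MEMBIND_BYNODESET);
    }
    return buf;
}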