Watcha,

I have recently come to install the PISM package on top of PETSc, which, in turn, is built against OpenMPI 1.8.1 on our Science Faculty HPC Facility, which has SGI C2112 compute nodes with 64GB RAM running CentOS 6.
In testing the PETSc deployment and when running PISM itself, I am seeing the

  " ...OpenFabrics subsystem is configured to only allow registering part of your physical memory ..."

message, telling me

  Registerable memory: 32768 MiB
  Total memory: 524285 MiB

Oh yeah, that's the 512GB big memory node, not a 64GB compute node, which says

  Registerable memory: 32768 MiB
  Total memory: 65534 MiB

but still suggests a default that only allows the use of 32GB.

So, having followed my nose to the OpenMPI FAQ sections, and the Mellanox community page,

  http://community.mellanox.com/docs/DOC-1120

which suggests the defaults for the two parameters in need of a tweak are

  log_num_mtt 20
  log_mtts_per_seg 0

I came to try and tweak those Mellanox driver parameters.

What I see on my compute nodes is

  # cat /sys/module/mlx4_core/parameters/log_num_mtt
  0
  # cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
  3
  #

so something that doesn't match the defaults the Mellanox page suggests I should be seeing.

Furthermore, having "done the math" and realised that I probably want

  log_num_mtt 22
  log_mtts_per_seg 3

to allow OpenMPI to register double the compute nodes' physical memory (128GB - because giving it 1 TB on the big memory node seems excessive!), when I come to alter those values, I can't seem to.

Trying to add a module load option

  options mlx4_core log_num_mtt=22

by modifying the file

  /etc/modprobe.d/mlx4.conf

never sees that value honoured after a full node reboot.

It also appears that the files under /sys/module/mlx4_core/parameters/ are nearly all read-only, including the ones it's suggested that I tweak, viz:

  # echo 22 > /sys/module/mlx4_core/parameters/log_num_mtt
  -bash: /sys/module/mlx4_core/parameters/log_num_mtt: Permission denied
  # ls -l /sys/module/mlx4_core/parameters/log_num_mtt
  -r--r--r--. 1 root root 4096 Dec 5 13:08 /sys/module/mlx4_core/parameters/log_num_mtt

so I'm getting the impression that the Mellanox driver doesn't really want the defaults altered?
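
For reference, the arithmetic behind the 32GB figure and the proposed log_num_mtt 22 / log_mtts_per_seg 3 can be sketched as below. This is only a sketch: it assumes the formula from the OpenMPI FAQ (max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE) and a 4 KiB page size, which is an assumption about these nodes rather than something measured. The function name is made up for illustration.

```python
# Sketch of the registerable-memory arithmetic for mlx4_core.
# Assumed formula (OpenMPI FAQ):
#   max_reg_mem = 2**log_num_mtt * 2**log_mtts_per_seg * PAGE_SIZE
# PAGE_SIZE of 4096 bytes is an assumption; check with `getconf PAGE_SIZE`.

PAGE_SIZE = 4096  # bytes (assumed)
MIB = 2 ** 20


def max_reg_mem_mib(log_num_mtt, log_mtts_per_seg, page_size=PAGE_SIZE):
    """Return the maximum registerable memory in MiB (hypothetical helper)."""
    total_bytes = (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * page_size
    return total_bytes // MIB


if __name__ == "__main__":
    # Values that would reproduce the reported 32768 MiB limit.
    print(max_reg_mem_mib(20, 3))   # 32768 MiB = 32 GiB
    # Proposed values: twice the 64GB of a compute node.
    print(max_reg_mem_mib(22, 3))   # 131072 MiB = 128 GiB
```

If the formula holds, the reported 32768 MiB is what log_num_mtt 20 with log_mtts_per_seg 3 would give, so the 0 shown in sysfs may simply mean "use the driver's built-in default" rather than a literal value of zero; that reading is a guess, not something confirmed here.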
OK, so if I can't tell my nodes to allow OpenMPI to use any more than 32GB, how do I turn off the OpenMPI message that is telling me about it?

Kevin M. Buckley
eScience Consultant
School of Engineering and Computer Science
Victoria University of Wellington
New Zealand