Hi Waleed,

Upgrading to the latest OFED is highly recommended. In the meantime, can you try the latest OMPI release (v1.8.4), where this warning is ignored on older OFEDs?
-Devendar

On Sun, Dec 28, 2014 at 6:03 AM, Waleed Lotfy <waleed.lo...@bibalex.org> wrote:

> I have a bunch of 8 GB memory nodes in a cluster who were lately
> upgraded to 16 GB. When I run any jobs I get the following warning:
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory. This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered. You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel
> module parameters:
>
> http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>
> Local host: comp022.local
> Registerable memory: 8192 MiB
> Total memory: 16036 MiB
>
> Your MPI job will continue, but may be behave poorly and/or hang.
> --------------------------------------------------------------------------
>
> Searching for a fix to this issue, I found that I have to set
> log_num_mtt within the kernel module, so I added this line to
> modprobe.conf:
>
> options mlx4_core log_num_mtt=21
>
> But then ib0 interface fails to start showing this error:
> ib_ipoib device ib0 does not seem to be present, delaying
> initialization.
>
> Reducing the value of log_num_mtt to 20, allows ib0 to start but shows
> the registerable memory of 8 GB warning.
>
> I am using OFED 1.3.1, I know it is pretty old and we are planning to
> upgrade soon.
>
> Output on all nodes for 'ompi_info -v ompi full --parsable':
>
> ompi:version:full:1.2.7
> ompi:version:svn:r19401
> orte:version:full:1.2.7
> orte:version:svn:r19401
> opal:version:full:1.2.7
> opal:version:svn:r19401
>
> Any help would be appreciated.
>
> Waleed Lotfy
> Bibliotheca Alexandrina
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/12/26076.php
>

--
-Devendar
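
For reference, here is a rough sketch of how the "Registerable memory" figure in the warning relates to log_num_mtt. It assumes the formula given in the Open MPI FAQ item linked in the warning (max_reg_mem = 2^log_num_mtt * 2^log_mtts_per_seg * PAGE_SIZE), a 4 KiB page size, and log_mtts_per_seg=1. The last two values are assumptions, not stated in the thread; they are chosen because they reproduce the reported 8192 MiB at log_num_mtt=20. The function names are hypothetical helpers, not part of OFED or Open MPI.

```python
# Sketch only: estimates registerable memory from mlx4_core parameters,
# using the formula from the Open MPI FAQ (ib-locked-pages item).
# PAGE_SIZE and the default log_mtts_per_seg are ASSUMPTIONS; check the
# real values with `getconf PAGE_SIZE` and under
# /sys/module/mlx4_core/parameters/ on the node itself.

PAGE_SIZE = 4096  # bytes (assumed)
MIB = 1024 * 1024


def max_reg_mem_mib(log_num_mtt, log_mtts_per_seg=1, page_size=PAGE_SIZE):
    """Registerable memory in MiB for the given module parameters."""
    total_bytes = (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * page_size
    return total_bytes // MIB


def min_log_num_mtt(target_mib, log_mtts_per_seg=1, page_size=PAGE_SIZE):
    """Smallest log_num_mtt whose registerable memory covers target_mib."""
    bytes_per_entry = page_size * (2 ** log_mtts_per_seg)
    # Ceiling division: number of MTT entries needed to cover the target.
    entries = -(-(target_mib * MIB) // bytes_per_entry)
    # Smallest n with 2**n >= entries, computed with integers only.
    return max(0, (entries - 1).bit_length())


if __name__ == "__main__":
    print(max_reg_mem_mib(20))         # 8192, matches the warning
    print(max_reg_mem_mib(21))         # 16384
    print(min_log_num_mtt(16036))      # 21, covers the node's total memory
    print(min_log_num_mtt(2 * 16036))  # 22, covers twice physical memory
```

Under these assumptions, log_num_mtt=21 is the smallest value that covers the 16036 MiB reported for comp022.local, which is consistent with the value Waleed tried. The FAQ suggests allowing roughly twice physical memory to be registered, which would correspond to 22 here. None of this explains why ib0 fails to come up at 21 on OFED 1.3.1; it only shows where the 8192 MiB figure comes from.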