You are right.

I didn't know that SGE used limits other than '/etc/security/limits.conf', even 
though you explained it :/

The resolution was to add 'H_MEMORYLOCKED=unlimited' to the execd_params.
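
For the archives, roughly what I did (the exact steps may differ a bit
with your Grid Engine version):

   # edit the global cluster configuration:
   qconf -mconf
   # ...and on the execd_params line add:
   execd_params   H_MEMORYLOCKED=unlimited
   # you may need to restart sge_execd on the nodes for it to take effect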

Thank you all for your time and efforts and keep up the great work :)

Waleed Lotfy
Bibliotheca Alexandrina
________________________________________
From: users [users-boun...@open-mpi.org] on behalf of Gus Correa 
[g...@ldeo.columbia.edu]
Sent: Tuesday, January 06, 2015 9:11 PM
To: Open MPI Users
Subject: Re: [OMPI users] Icreasing OFED registerable memory

Hi Waleed

As Devendar said (and I tried to explain before),
you need to allow the locked memory limit to be unlimited for
user processes (in /etc/security/limits.conf),
*AND* somehow the daemon/job_script/whatever that launches the mpiexec
command must request "ulimit -l unlimited" (directly or indirectly).
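
For the first part, the limits.conf entries usually look like this
(a sketch; adjust the domain field to your users or groups):

   # /etc/security/limits.conf  (on every compute node)
   *    soft    memlock    unlimited
   *    hard    memlock    unlimited
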
The latter part depends on your system's details.
I am not familiar with SGE (I use Torque), but presumably you can
add "ulimit -l unlimited" when you launch
the SGE daemons on the nodes.
Presumably this will make the processes launched by that daemon
(i.e. your mpiexec) inherit those limits,
and that is how I do it on Torque.
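
On Torque, for example, that amounts to adding one line near the top of
the pbs_mom init script (a sketch; the script name and location depend
on your installation):

   # e.g. in /etc/init.d/pbs_mom, before the daemon is started:
   ulimit -l unlimited

Something similar should work in whatever script starts the SGE
execution daemons on the nodes.
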
A more brute-force way is just to include "ulimit -l unlimited"
in your job script before mpiexec.
Inserting a "ulimit -a" in your job script may help diagnose what you
actually have.
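
Something along these lines at the top of the job script would cover both
(a sketch; "my_mpi_program" is just a placeholder for your usual mpiexec
line):

   #!/bin/bash
   #$ -cwd
   # ask for unlimited locked memory; this only succeeds if the
   # hard limit already allows it
   ulimit -l unlimited
   # print the limits actually in effect, for diagnosis
   ulimit -a
   mpiexec ./my_mpi_program
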
Please see the OMPI FAQ that I sent you before for more details.

I hope this helps,
Gus Correa

On 01/06/2015 01:37 PM, Deva wrote:
> Hi Waleed,
>
> ----------
>    Memlock limit: 65536
> ----------
>
> Such a low limit is most likely due to the per-user locked memory limit. Can
> you make sure it is set to "unlimited" on all nodes ("ulimit -l unlimited")?
>
> -Devendar
>
> On Tue, Jan 6, 2015 at 3:42 AM, Waleed Lotfy <waleed.lo...@bibalex.org
> <mailto:waleed.lo...@bibalex.org>> wrote:
>
>     Hi guys,
>
>     Sorry for getting back so late, but we ran into some problems during
>     the installation process. As soon as the system came up, I tested the
>     new versions against the problem, but they produced another
>     memory-related warning.
>
>     --------------------------------------------------------------------------
>     The OpenFabrics (openib) BTL failed to initialize while trying to
>     allocate some locked memory.  This typically can indicate that the
>     memlock limits are set too low.  For most HPC installations, the
>     memlock limits should be set to "unlimited".  The failure occured
>     here:
>
>        Local host:    comp003.local
>        OMPI source:   btl_openib_component.c:1200
>        Function:      ompi_free_list_init_ex_new()
>        Device:        mlx4_0
>        Memlock limit: 65536
>
>     You may need to consult with your system administrator to get this
>     problem fixed.  This FAQ entry on the Open MPI web site may also be
>     helpful:
>
>     http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>     --------------------------------------------------------------------------
>     --------------------------------------------------------------------------
>     WARNING: There was an error initializing an OpenFabrics device.
>
>        Local host:   comp003.local
>        Local device: mlx4_0
>     --------------------------------------------------------------------------
>
>     <<<Then the output of the program follows.>>>
>
>     My current running versions:
>
>     OpenMPI: 1.6.4
>     OFED-internal-2.3-2
>
>     I checked /etc/security/limits.d/, the scheduler's (Grid Engine)
>     configuration, and tried adding the following line to
>     /etc/modprobe.d/mlx4_core: 'options mlx4_core log_num_mtt=22
>     log_mtts_per_seg=1' as suggested by Gus.
>
>     I am running out of ideas here, so any help is appreciated.
>
>     P.S. I am not sure whether I should open a new thread for this issue or
>     continue with the current one, so please advise.
>
>     Waleed Lotfy
>     Bibliotheca Alexandrina
>     _______________________________________________
>     users mailing list
>     us...@open-mpi.org <mailto:us...@open-mpi.org>
>     Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>     Link to this post:
>     http://www.open-mpi.org/community/lists/users/2015/01/26107.php
>
>
>
>
> --
>
>
> -Devendar
>
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2015/01/26109.php
>

_______________________________________________
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/01/26111.php
