Can you read this thread and check whether the following Grid Engine
parameter setting is missing?

http://marc.info/?l=npaci-rocks-discussion&m=135844781420954&w=2

-----------------------
Check that your GridEngine configuration has the following:

execd_params H_MEMORYLOCKED=infinity

The command qconf -sconf will display the current configuration.
--------------------------
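
A minimal Python sketch of that check, in case it helps to script it across
nodes. The function names are hypothetical, and the parsing assumes that
"qconf -sconf" prints one "name value" pair per line with execd_params
entries separated by commas or whitespace (continuation lines are not
handled). The last helper reads the locked-memory limit of the current
process, which is the value "ulimit -l" reports (in bytes here rather than
KiB).

```python
import re
import resource


def execd_params(sconf_text):
    """Collect execd_params entries from `qconf -sconf` output into a dict.

    Assumes one "name value" pair per line; entries without "=" (for
    example the literal "none") are ignored.
    """
    params = {}
    for line in sconf_text.splitlines():
        fields = line.split(None, 1)
        if len(fields) == 2 and fields[0] == "execd_params":
            for item in re.split(r"[,\s]+", fields[1].strip()):
                if "=" in item:
                    key, value = item.split("=", 1)
                    params[key] = value
    return params


def memorylocked_is_infinity(sconf_text):
    """True if execd_params contains H_MEMORYLOCKED=infinity."""
    value = execd_params(sconf_text).get("H_MEMORYLOCKED", "")
    return value.lower() == "infinity"


def memlock_soft_limit():
    """Soft RLIMIT_MEMLOCK of this process: bytes, or "unlimited"."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return "unlimited" if soft == resource.RLIM_INFINITY else soft
```

To use it, pass the captured text of "qconf -sconf" to
memorylocked_is_infinity(), and call memlock_soft_limit() from inside a job
submitted through the scheduler, since the limit a job sees can differ from
the one in a login shell.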

-Devendar

On Tue, Jan 6, 2015 at 1:37 PM, Deva <devendar.bure...@gmail.com> wrote:

> Hi Waleed,
>
> ----------
>    Memlock limit: 65536
> ----------
>
> Such a low limit is most likely due to the per-user locked-memory limit.
> Can you make sure it is set to "unlimited" on all nodes
> ("ulimit -l unlimited")?
>
> -Devendar
>
> On Tue, Jan 6, 2015 at 3:42 AM, Waleed Lotfy <waleed.lo...@bibalex.org>
> wrote:
>
>> Hi guys,
>>
>> Sorry for getting back so late. We ran into some problems during the
>> installation process, and as soon as the system came up I tested the new
>> versions for the problem, but they showed another memory-related warning.
>>
>> --------------------------------------------------------------------------
>> The OpenFabrics (openib) BTL failed to initialize while trying to
>> allocate some locked memory.  This typically can indicate that the
>> memlock limits are set too low.  For most HPC installations, the
>> memlock limits should be set to "unlimited".  The failure occured
>> here:
>>
>>   Local host:    comp003.local
>>   OMPI source:   btl_openib_component.c:1200
>>   Function:      ompi_free_list_init_ex_new()
>>   Device:        mlx4_0
>>   Memlock limit: 65536
>>
>> You may need to consult with your system administrator to get this
>> problem fixed.  This FAQ entry on the Open MPI web site may also be
>> helpful:
>>
>>     http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
>> --------------------------------------------------------------------------
>> --------------------------------------------------------------------------
>> WARNING: There was an error initializing an OpenFabrics device.
>>
>>   Local host:   comp003.local
>>   Local device: mlx4_0
>> --------------------------------------------------------------------------
>>
>> <<<Then the output of the program follows.>>>
>>
>> My current running versions:
>>
>> OpenMPI: 1.6.4
>> OFED-internal-2.3-2
>>
>> I checked /etc/security/limits.d/, the scheduler's configurations (grid
>> engine) and tried adding the following line to /etc/modprobe.d/mlx4_core:
>> 'options mlx4_core log_num_mtt=22 log_mtts_per_seg=1' as suggested by Gus.
>>
>> I am running out of ideas here, so any help is appreciated.
>>
>> P.S. I am not sure whether I should open a new thread for this issue or
>> continue with the current one, so please advise.
>>
>> Waleed Lotfy
>> Bibliotheca Alexandrina
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post:
>> http://www.open-mpi.org/community/lists/users/2015/01/26107.php
>>
>



