Re: [OMPI users] OpenMPI 1.6.3 and Memory Issues

2012-11-29 Thread Joseph Farran

Hi again.

I am using /etc/modprobe.d/mofed.conf, otherwise I get:

WARNING: Deprecated config file /etc/modprobe.conf, all config files belong 
into /etc/modprobe.d/

But I am still getting the memory errors after making the changes and rebooting:

$ cat /etc/modprobe.d/mofed.conf
options mlx4_core log_num_mtt=24
options mlx4_core log_mtts_per_seg=1
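After the reboot it is worth confirming that the kernel actually picked up these values. A minimal check, assuming the mlx4 driver exports its parameters read-only under sysfs (the usual case):

```shell
# Print the live mlx4_core MTT parameters, if the driver is loaded and
# exports them; otherwise say so rather than fail.
for p in log_num_mtt log_mtts_per_seg; do
    f=/sys/module/mlx4_core/parameters/$p
    if [ -r "$f" ]; then
        echo "$p = $(cat "$f")"
    else
        echo "$p: not exposed (driver not loaded?)"
    fi
done
```

If these still show the old values after a reboot, the options were not applied; one common cause is the driver being loaded from an initrd that was built before the config change.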

$ mpirun hello
--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.
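One possible reason the warning persists: if the usual mlx4 sizing formula applies (registerable memory = 2^log_num_mtt x 2^log_mtts_per_seg x page size), then 24/1 with 4 KiB pages caps registration at 128 GiB, which is still below the 258470 MiB (~252 GiB) this node reports as total memory. A quick sanity check of the arithmetic:

```shell
# Registerable memory under mlx4: 2^log_num_mtt * 2^log_mtts_per_seg * page_size.
# Assumes the standard 4 KiB page size; adjust if your kernel differs.
log_num_mtt=24
log_mtts_per_seg=1
page_size=4096
max_reg_bytes=$(( (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size ))
echo "max registerable: $(( max_reg_bytes >> 30 )) GiB"   # prints: max registerable: 128 GiB
total_mib=258470
echo "total memory:     $(( total_mib >> 10 )) GiB"       # prints: total memory:     252 GiB
```

By that formula, log_num_mtt=25 (keeping log_mtts_per_seg=1) would raise the cap to 256 GiB, just above this node's RAM; that is an inference from the formula, not something confirmed in this thread.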



On 11/29/2012 04:39 PM, Yevgeny Kliteynik wrote:

You can also set these parameters in /etc/modprobe.conf:

   options mlx4_core log_num_mtt=24 log_mtts_per_seg=1

-- YK





Re: [OMPI users] OpenMPI 1.6.3 and Memory Issues

2012-11-29 Thread Yevgeny Kliteynik
You can also set these parameters in /etc/modprobe.conf:

  options mlx4_core log_num_mtt=24 log_mtts_per_seg=1

-- YK

On 11/30/2012 2:12 AM, Yevgeny Kliteynik wrote:
> On 11/30/2012 12:47 AM, Joseph Farran wrote:
>> I'll assume: /etc/modprobe.d/mlx4_en.conf
> 
> Add these to /etc/modprobe.d/mofed.conf:
> 
> options mlx4_core log_num_mtt=24
> options mlx4_core log_mtts_per_seg=1
> 
> And then restart the driver.
> You need to do it on all the machines.
> 
> -- YK
> 
>>
>> On 11/29/2012 02:34 PM, Joseph Farran wrote:
>>> Where do I change those Mellanox settings?
>>>
>>> On 11/29/2012 02:23 PM, Jeff Squyres wrote:
 See http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem.

 On Nov 29, 2012, at 5:21 PM, Joseph Farran wrote:

>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
> 



Re: [OMPI users] OpenMPI 1.6.3 and Memory Issues

2012-11-29 Thread Yevgeny Kliteynik
On 11/30/2012 12:47 AM, Joseph Farran wrote:
> I'll assume: /etc/modprobe.d/mlx4_en.conf

Add these to /etc/modprobe.d/mofed.conf:

options mlx4_core log_num_mtt=24
options mlx4_core log_mtts_per_seg=1

And then restart the driver.
You need to do it on all the machines.

-- YK
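For reference, "restart the driver" on a stock OFED install usually means cycling the openibd service rather than rebooting; a sketch, assuming the standard init script location (adjust if your OFED packaging differs):

```shell
# Reload the InfiniBand stack so mlx4_core rereads its module options.
# OFED installs an 'openibd' init script; if it is absent, the driver
# is packaged differently and must be restarted by other means.
if [ -x /etc/init.d/openibd ]; then
    /etc/init.d/openibd restart
else
    echo "openibd init script not found; restart the mlx4 driver another way"
fi
```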

> 
> On 11/29/2012 02:34 PM, Joseph Farran wrote:
>> Where do I change those Mellanox settings?
>>
>> On 11/29/2012 02:23 PM, Jeff Squyres wrote:
>>> See http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem.
>>>
>>> On Nov 29, 2012, at 5:21 PM, Joseph Farran wrote:
>>>
>>
> 



Re: [OMPI users] OpenMPI 1.6.3 and Memory Issues

2012-11-29 Thread Joseph Farran

I'll assume:  /etc/modprobe.d/mlx4_en.conf


On 11/29/2012 02:34 PM, Joseph Farran wrote:

Where do I change those Mellanox settings?

On 11/29/2012 02:23 PM, Jeff Squyres wrote:

See http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem.

On Nov 29, 2012, at 5:21 PM, Joseph Farran wrote:








Re: [OMPI users] OpenMPI 1.6.3 and Memory Issues

2012-11-29 Thread Joseph Farran

Where do I change those Mellanox settings?

On 11/29/2012 02:23 PM, Jeff Squyres wrote:

See http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem.

On Nov 29, 2012, at 5:21 PM, Joseph Farran wrote:





Re: [OMPI users] OpenMPI 1.6.3 and Memory Issues

2012-11-29 Thread Jeff Squyres
See http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem.

On Nov 29, 2012, at 5:21 PM, Joseph Farran wrote:

> Hi All.
> 
> When I compile a simple Hello World program with OpenMPI 1.6.3 and run it
> with mpirun, I get:
> 
> $ ulimit -l unlimited
> $ mpirun -np 2 hello
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory.  This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
> 
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered.  You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
> 
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
> 
>http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages
> 
>  Local host:  hpc
>  Registerable memory: 4096 MiB
>  Total memory:258470 MiB
> 
> Your MPI job will continue, but may behave poorly and/or hang.
> --------------------------------------------------------------------------
> Hello World.   I am the Master Node (hpc) with Rank 0.
> Hello World.   I am compute Node (hpc) with Rank 1
> [hpc:08261] 1 more process has sent help message help-mpi-btl-openib.txt / 
> reg mem limit low
> [hpc:08261] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help 
> / error messages
> 
> 
> I have my limits set up with:
> cat /etc/security/limits.conf
> * soft memlock unlimited
> * hard memlock unlimited
> 
> What am I missing?
> 
> OS is CentOS 6.3.
> 
> Joseph


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] OpenMPI 1.6.3 and Memory Issues

2012-11-29 Thread Joseph Farran

Hi All.

When I compile a simple Hello World program with OpenMPI 1.6.3 and run it
with mpirun, I get:

$ ulimit -l unlimited
$ mpirun -np 2 hello
--------------------------------------------------------------------------
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered.  You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.

See this Open MPI FAQ item for more information on these Linux kernel module
parameters:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

  Local host:  hpc
  Registerable memory: 4096 MiB
  Total memory:258470 MiB

Your MPI job will continue, but may behave poorly and/or hang.
--------------------------------------------------------------------------
Hello World.   I am the Master Node (hpc) with Rank 0.
Hello World.   I am compute Node (hpc) with Rank 1
[hpc:08261] 1 more process has sent help message help-mpi-btl-openib.txt / reg 
mem limit low
[hpc:08261] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / 
error messages


I have my limits set up with:
cat /etc/security/limits.conf
* soft memlock unlimited
* hard memlock unlimited
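For what it's worth, the memlock limits above remove only the OS-level cap; the "Registerable memory: 4096 MiB" figure in the warning comes from the HCA's MTT table, whose size is set by the mlx4_core module parameters. A sketch that computes the cap from the live parameters (the sysfs paths and the 20/0 fallback values are illustrative assumptions, chosen because they reproduce a 4096 MiB cap, not the driver's documented defaults):

```shell
# MTT cap = 2^log_num_mtt * 2^log_mtts_per_seg * page_size (4 KiB assumed).
base=/sys/module/mlx4_core/parameters
num=$( [ -r "$base/log_num_mtt" ] && cat "$base/log_num_mtt" || echo 20 )
seg=$( [ -r "$base/log_mtts_per_seg" ] && cat "$base/log_mtts_per_seg" || echo 0 )
cap_mib=$(( ((1 << num) * (1 << seg) * 4096) >> 20 ))
echo "MTT registerable-memory cap: ${cap_mib} MiB"
```

With the fallback values this prints a 4096 MiB cap, matching the warning; raising log_num_mtt (or log_mtts_per_seg) raises the cap accordingly.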

What am I missing?

OS is CentOS 6.3.

Joseph