Re: [OMPI users] Re: can you help me please? thanks

2013-12-06 Thread Bruno Coutinho
Probably it was the change from the eager to the rendezvous protocol, as Jeff
said.

If you don't know what these are, read these:
https://computing.llnl.gov/tutorials/mpi_performance/#Protocols
http://blogs.cisco.com/performance/what-is-an-mpi-eager-limit/
http://blogs.cisco.com/performance/eager-limits-part-2/
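
To see the protocol switch for yourself, here is a minimal sketch (not part of
the original thread; it assumes two ranks, the TCP or sm BTL, and Open MPI's
default eager limit of roughly 64 kB). Rank 1 posts its receive late on
purpose, so a send below the eager limit returns almost immediately, while a
send above it waits for the rendezvous with the receiver:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    /* 4 kB message (below the default eager limit) and 4 MB message (above it). */
    int sizes[2] = { 1024, 1024 * 1024 };
    int rank, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < 2; i++) {
        int *buf = (int *)malloc(sizeof(int) * sizes[i]);
        MPI_Barrier(MPI_COMM_WORLD);
        if (rank == 0) {
            double t = MPI_Wtime();
            MPI_Send(buf, sizes[i], MPI_INT, 1, 0, MPI_COMM_WORLD);
            printf("%d ints: MPI_Send returned after %.3f s\n",
                   sizes[i], MPI_Wtime() - t);
        } else if (rank == 1) {
            sleep(2);  /* the receiver is deliberately late */
            MPI_Recv(buf, sizes[i], MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }
        free(buf);
    }

    MPI_Finalize();
    return 0;
}

Built with mpicc and run as, e.g., "mpirun -np 2 ./eager_demo" (names are
placeholders), the 1024-int send should report a time near zero and the
1024*1024-int send a time near the 2-second delay; exact numbers depend on
your BTL and eager-limit settings.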

You can tune the eager limit by changing the MCA parameters btl_tcp_eager_limit
(for TCP), btl_self_eager_limit (communication from a process to itself),
btl_sm_eager_limit (shared memory), and btl_udapl_eager_limit or
btl_openib_eager_limit (if you use InfiniBand).
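
For example (only a sketch: the byte values and program name are placeholders,
and the defaults differ between Open MPI versions and BTLs), the limits can be
raised on the mpirun command line:

mpirun --mca btl_tcp_eager_limit 131072 --mca btl_sm_eager_limit 131072 -np 4 ./your_program

or set once per user in ~/.openmpi/mca-params.conf, one "parameter = value"
pair per line.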


2013/12/6 Jeff Squyres (jsquyres) 

> I sent you some further questions yesterday:
>
> http://www.open-mpi.org/community/lists/users/2013/12/23158.php
>
>
> On Dec 6, 2013, at 1:35 AM, 胡杨 <781578...@qq.com> wrote:
>
> > Here is my code:
> > int *a = (int *)malloc(sizeof(int) * number);
> > MPI_Send(a, number, MPI_INT, 1, 1, MPI_COMM_WORLD);
> >
> > MPI_Status status;
> > int *b = (int *)malloc(sizeof(int) * number);
> > MPI_Recv(b, number, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
> >
> > number here is the size of my array (e.g., a or b).
> > I have tried it on my local computer and on my ROCKS cluster. On the ROCKS
> > cluster, one processor on the frontend node uses MPI_Send to send a
> > message, and the other processors on the compute nodes use MPI_Recv to
> > receive it.
> > When number is less than 1, the other processors receive the message fast,
> > but when number is more than 15000, the other processors receive the
> > message slowly.
> > Why? Is it because of the Open MPI API, or some other problem?
> >
> > I have spent a few days on this, and I would like your help. Thanks to all
> > readers, and good luck to you.
> >
> >
> >
> >
> > -- Original Message --
> > From: "Ralph Castain";;
> > Sent: Thursday, December 5, 2013, 6:52 PM
> > To: "Open MPI Users";
> > Subject: Re: [OMPI users] can you help me please? thanks
> >
> > You are running 15000 ranks on two nodes?? My best guess is that you are
> > swapping like crazy as your memory footprint probably exceeds available
> > physical memory.
> >
> >
> >
> > On Thu, Dec 5, 2013 at 1:04 AM, 胡杨 <781578...@qq.com> wrote:
> > My ROCKS cluster includes one frontend and two compute nodes. In my
> > program I use the Open MPI API, such as MPI_Send and MPI_Recv, but
> > when I run the program with 3 processors, one processor sends a message
> > and the others receive it. Here is some code.
> > int *a = (int *)malloc(sizeof(int) * number);
> > MPI_Send(a, number, MPI_INT, 1, 1, MPI_COMM_WORLD);
> >
> > MPI_Status status;
> > int *b = (int *)malloc(sizeof(int) * number);
> > MPI_Recv(b, number, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &status);
> >
> > When number is less than 1, it runs fast,
> > but when number is more than 15000, it runs slowly.
> >
> > Why? Is it because of the Open MPI API, or some other problem?
> > -- Original Message --
> > From: "Ralph Castain";;
> > Sent: Tuesday, December 3, 2013, 1:39 PM
> > To: "Open MPI Users";
> > Subject: Re: [OMPI users] can you help me please? thanks
> >
> >
> >
> >
> >
> > On Mon, Dec 2, 2013 at 9:23 PM, 胡杨 <781578...@qq.com> wrote:
> > A simple program on my 4-node ROCKS cluster runs fine with the command:
> > /opt/openmpi/bin/mpirun -np 4 -machinefile machines ./sort_mpi6
> >
> >
> > Another, bigger program runs fine on the head node only with the command:
> >
> > cd ./sphere; /opt/openmpi/bin/mpirun -np 4 ../bin/sort_mpi6
> >
> > But with the command:
> >
> > cd /sphere; /opt/openmpi/bin/mpirun -np 4 -machinefile ../machines
> > ../bin/sort_mpi6
> >
> > It gives this output:
> >
> > ../bin/sort_mpi6: error while loading shared libraries: libgdal.so.1:
> > cannot open shared object file: No such file or directory
> > ../bin/sort_mpi6: error while loading shared libraries: libgdal.so.1:
> > cannot open shared object file: No such file or directory
> > ../bin/sort_mpi6: error while loading shared libraries: libgdal.so.1:
> > cannot open shared object file: No such file or directory
> >
> >
> >
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>


Re: [OMPI users] specifying hosts in mpi_spawn()

2008-05-30 Thread Bruno Coutinho
I'm using Open MPI 1.2.6 from the Open MPI site, but I can switch to another
version if necessary.


2008/5/30 Ralph H Castain <r...@lanl.gov>:

> I'm afraid I cannot answer that question without first knowing what version
> of Open MPI you are using. Could you provide that info?
>
> Thanks
> Ralph
>
>
>
> On 5/29/08 6:41 PM, "Bruno Coutinho" <couti...@dcc.ufmg.br> wrote:
>
> > How does MPI handle the host string passed in the info argument to
> > MPI_Comm_spawn()?
> >
> > If I set host to:
> > "host1,host2,host3,host2,host2,host1"
> >
> > will ranks 0 and 5 then run on host1, ranks 1, 3, and 4 on host2, and rank 2
> > on host3?
>
>
>


[OMPI users] specifying hosts in mpi_spawn()

2008-05-29 Thread Bruno Coutinho
How does MPI handle the host string passed in the info argument to
MPI_Comm_spawn()?

If I set host to:
"host1,host2,host3,host2,host2,host1"

will ranks 0 and 5 then run on host1, ranks 1, 3, and 4 on host2, and rank 2 on
host3?


Re: [OMPI users] MPI::Intracomm::Spawn and cluster configuration

2007-12-16 Thread Bruno Coutinho
Try using the info parameter of MPI::Intracomm::Spawn().
In this structure you can specify the hosts on which you want to spawn.

Info parameters for MPI spawn:
http://www.mpi-forum.org/docs/mpi-20-html/node97.htm
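
As a minimal C sketch (not from the original thread; the worker binary
"./slave" and the host names are placeholders, and the C++ binding
MPI::Intracomm::Spawn takes the analogous MPI::Info argument):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Comm intercomm;
    MPI_Info info;
    int errcodes[3];

    MPI_Init(&argc, &argv);

    /* The "host" info key tells Open MPI where the children may be started. */
    MPI_Info_create(&info);
    MPI_Info_set(info, "host", "host1,host2,host3");

    /* Spawn 3 workers; this process (root 0 of MPI_COMM_SELF) drives the spawn. */
    MPI_Comm_spawn("./slave", MPI_ARGV_NULL, 3, info, 0, MPI_COMM_SELF,
                   &intercomm, errcodes);

    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}

How a comma-separated list with repeated entries maps individual ranks to
hosts (the question in the mpi_spawn() thread above) is worth verifying with
your Open MPI release before relying on a particular ordering.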


2007/12/12, Elena Zhebel :
>
>  Hello,
>
> I'm working on an MPI application where I'm using Open MPI instead of MPICH.
>
> In my "master" program I call the function MPI::Intracomm::Spawn, which
> spawns "slave" processes. It is not clear to me how to spawn the "slave"
> processes over the network. Currently the "master" creates the "slaves" on the
> same host.
> If I use 'mpirun --hostfile openmpi.hosts' then processes are spawned over
> the network as expected. But now I need to spawn processes over the network
> from my own executable using MPI::Intracomm::Spawn; how can I achieve that?
>
> Thanks in advance for any help.
> Elena
>
>
>


Re: [OMPI users] Too many open files Error

2007-11-01 Thread Bruno Coutinho
This page has information on how to increase the limit on open files.
Options 1 and 3 there don't require a reboot.

http://www.cs.uwaterloo.ca/~brecht/servers/openfiles.html
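
For example (a sketch assuming bash on a typical Linux node; the limit value,
user name, and program name are placeholders), the per-process limit can be
raised in the shell that launches the job, with no reboot:

ulimit -n 8192
mpirun -np 1024 ./your_program

This works up to the hard limit without root. For a persistent per-user
setting, a line such as "username soft nofile 8192" in
/etc/security/limits.conf (root needed to edit, but no reboot) takes effect at
the next login. If IPv6 sockets are doubling the count (see George's note
below), newer Open MPI releases can also disable them in the TCP BTL, e.g.
"mpirun --mca btl_tcp_disable_family 6 ...", where the value names the address
family to disable.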


2007/10/31, George Bosilca :
>
> For some versions of Open MPI (recent versions) you can use the
> btl_tcp_disable_family MCA parameter to disable IPv6 at runtime.
> Unfortunately, there is no similar option allowing you to disable IPv6
> for the runtime environment.
>
>george.
>
> On Oct 31, 2007, at 6:55 PM, Tim Prins wrote:
>
> > Hi Clement,
> >
> > I seem to recall (though this may have changed) that if a system supports
> > IPv6, we may open both IPv4 and IPv6 sockets. This can be worked around by
> > configuring Open MPI with --disable-ipv6.
> >
> > Other than that, I don't know of anything else to do except raise the limit
> > on the number of open files.
> >
> > I know it doesn't help you now, but we are actively working on this
> > problem for Open MPI 1.3. This version will introduce a tree routing
> > scheme which will dramatically reduce the number of open sockets that
> > the runtime system needs.
> >
> > Hope this helps,
> >
> > Tim
> >
> > On Tuesday 30 October 2007 07:15:42 pm Clement Kam Man Chu wrote:
> >> Hi,
> >>
> >> I got a "Too many open files" error while running over 1024 processes
> >> on 512 cpus. I found the same error on
> >> http://www.open-mpi.org/community/lists/users/2006/11/2216.php, but I
> >> would like to know whether there is another solution besides changing
> >> the descriptor limit. Changing the descriptor limit requires root access
> >> and a system restart, which I don't want to do.
> >>
> >> Regards,
> >> Clement
> >
> >
>
>
>
>