Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread Jeff Squyres
On May 20, 2010, at 2:52 PM, Lisandro Dalcin wrote:

> Jeff, you should really learn Python and give mpi4py a try. Even if
> you do not consider Python a language for serious, production work
> :-), it would be a VERY productive one for writing tests targeting
> MPI.

Freely admitted laziness on my part (read: not enough cycles in the day to do 
what Cisco already pays me to do...).  :-(

> However, mpi4py has a BIG issue: not enough man-power for
> writing decent documentation.

Same issue here!  Maybe we should Google Wave it...  ;-)

> So you are suggesting my code could be buggy? No way! ;-)  Slightly
> more seriously: almost all my bug reports were discovered while
> unit-testing mpi4py and getting failures when running with Open MPI, so
> I'm really confident about my Python bindings.

I can't tell you how much we appreciate these reports.

I know exactly the position you're in; I did the same thing years ago (ick!) 
with what is (was!) the MPI C++ bindings and with Object Oriented MPI (OOMPI).  
They were portable packages that ran on lots of different MPIs; their 
respective test suites found lots of problems in various MPI implementations.  
The LAM/MPI guys sent me a t-shirt for my efforts, which pretty much locked in 
my long slide into the deep, dark world of MPI implementers.  ;-)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread Lisandro Dalcin
On 20 May 2010 11:09, Jeff Squyres  wrote:
> Can you send us an all-C or all-Fortran example that shows the problem?
>
> We don't have easy access to test through the python bindings.  ...ok, I 
> admit it, it's laziness on my part.  :-)
>

Jeff, you should really learn Python and give mpi4py a try. Even if
you do not consider Python a language for serious, production work
:-), it would be a VERY productive one for writing tests targeting
MPI. However, mpi4py has a BIG issue: not enough man-power for
writing decent documentation.

>
> But having a pure Open MPI test app would also remove some possible variables 
> and possible sources of error.
>

So you are suggesting my code could be buggy? No way! ;-)  Slightly
more seriously: almost all my bug reports were discovered while
unit-testing mpi4py and getting failures when running with Open MPI, so
I'm really confident about my Python bindings.


-- 
Lisandro Dalcin
---
CIMEC (INTEC/CONICET-UNL)
Predio CONICET-Santa Fe
Colectora RN 168 Km 472, Paraje El Pozo
Tel: +54-342-4511594 (ext 1011)
Tel/Fax: +54-342-4511169



Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread Jeff Squyres
Filed as https://svn.open-mpi.org/trac/ompi/ticket/2415.

Thanks for the bug report!


On May 20, 2010, at 1:33 PM, Edgar Gabriel wrote:

> thanks for pointing the problem out. I checked the code; the problem
> is in the MPI layer itself. The following check prevents us from doing
> anything:
> 
> 
> e.g. ompi/mpi/c/allgather.c
> 
>    if ((MPI_IN_PLACE != sendbuf && 0 == sendcount) ||
>        (0 == recvcount)) {
>        return MPI_SUCCESS;
>    }
> 
> 
> 
> so the problem is not in the modules/algorithms but in the API layer,
> which did not account for intercommunicators correctly. I'll try to
> fix it.
> 
> Thanks
> edgar
> 
> On 05/20/2010 10:48 AM, Battalgazi YILDIRIM wrote:
> > Hi,
> >
> > you are right, I should have provided C++ and Fortran examples, so I am
> > doing so now.
> >
> >
> > Here is "cplusplus.cpp"
> >
> > #include <mpi.h>
> > #include <iostream>
> > using namespace std;
> > int main()
> > {
> > MPI::Init();
> > char command[] = "./a.out";
> > MPI::Info info;
> > MPI::Intercomm child = MPI::COMM_WORLD.Spawn(command, NULL, 8, info, 0);
> > int a[8]={0,0,0,0,0,0,0,0};
> > int dummy;
> > child.Allgather(&dummy, 0, MPI::INT, a, 1, MPI::INT);
> > child.Disconnect();
> > cout << "a[";
> > for ( int i = 0; i < 7; i++ )
> > cout << a[i] << ",";
> > cout << a[7] << "]" << endl;
> >
> > MPI::Finalize();
> > }
> >
> >
> > Here again is "fortran.f90"
> >
> > program main
> >  use mpi
> >  implicit none
> >  integer :: parent, rank, val, dummy, ierr
> >  call MPI_Init(ierr)
> >  call MPI_Comm_get_parent(parent, ierr)
> >  call MPI_Comm_rank(parent, rank, ierr)
> >  val = rank + 1
> >  call MPI_Allgather(val,   1, MPI_INTEGER, &
> >                     dummy, 0, MPI_INTEGER, &
> >                     parent, ierr)
> >  call MPI_Comm_disconnect(parent, ierr)
> >  call MPI_Finalize(ierr)
> > end program main
> >
> > here is how you build and run
> >
> > -bash-3.2$ mpif90 fortran.f90
> > -bash-3.2$ mpiCC -o parent cplusplus.cpp
> > -bash-3.2$ ./parent
> > a[0,0,0,0,0,0,0,0]
> >
> >
> >
> > If I use mpich2,
> > -bash-3.2$ mpif90 fortran.f90
> > -bash-3.2$ mpiCC -o parent cplusplus.cpp
> > -bash-3.2$ ./parent
> > a[1,2,3,4,5,6,7,8]
> >
> > I hope that you can reproduce this problem with Open MPI.
> >
> > Thanks,
> >
> >
> > On Thu, May 20, 2010 at 10:09 AM, Jeff Squyres wrote:
> >
> > Can you send us an all-C or all-Fortran example that shows the problem?
> >
> > We don't have easy access to test through the python bindings.
> >  ...ok, I admit it, it's laziness on my part.  :-)  But having a
> > pure Open MPI test app would also remove some possible variables and
> > possible sources of error.
> >
> >
> > On May 20, 2010, at 9:43 AM, Battalgazi YILDIRIM wrote:
> >
> > > Hi Jody,
> > >
> > > I think that it is correct; you can test this example on your
> > > desktop,
> > >
> > > thanks,
> > >
> > > On Thu, May 20, 2010 at 3:18 AM, jody wrote:
> > > Hi
> > > I am really no python expert, but it looks to me as if you were
> > > gathering arrays filled with zeroes:
> > >  a = array('i', [0]) * n
> > >
> > > Shouldn't this line be
> > >  a = array('i', [r])*n
> > > where r is the rank of the process?
> > >
> > > Jody
> > >
> > >
> > > On Thu, May 20, 2010 at 12:00 AM, Battalgazi YILDIRIM wrote:
> > > > Hi,
> > > >
> > > >
> > > > I am trying to use intercommunicator ::Allgather between two
> > > > child processes.
> > > > I have Fortran and Python code;
> > > > I am using mpi4py for Python. It seems that ::Allgather is not
> > > > working
> > > > properly on my desktop.
> > > >
> > > > I first contacted the mpi4py developer (Lisandro Dalcin), who
> > > > simplified
> > > > my problem and provided two example files (python.py and
> > > > fortran.f90;
> > > > please see below).
> > > >
> > > > We tried different MPI vendors; the following example
> > > > worked correctly
> > > > (meaning the final printout should be array('i', [1, 2, 3, 4,
> > > > 5, 6, 7, 8])).
> > > >
> > > > However, it does not give the correct answer on my two desktops
> > > > (Red Hat and
> > > > Ubuntu), both
> > > > using Open MPI.
> > > >
> > > > Could you look at this problem, please?
> > > >
> > > > If you want to follow our earlier discussion, you can go to the
> > > > following
> > > > link:
> > > >
> > 
> > http://groups.google.com/group/mpi4py/browse_thread/thread/c17c660ae56ff97e
> > > >
> > > > yildirim@memosa:~/python_intercomm$ more python.py
> > > > from mpi4py import MPI
> > > > from array import array
> > > > import os
> > > >
> > > > progr = os.path.abspath('a.out')

Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread Edgar Gabriel
thanks for pointing the problem out. I checked the code; the problem
is in the MPI layer itself. The following check prevents us from doing
anything:


e.g. ompi/mpi/c/allgather.c

   if ((MPI_IN_PLACE != sendbuf && 0 == sendcount) ||
       (0 == recvcount)) {
       return MPI_SUCCESS;
   }



so the problem is not in the modules/algorithms but in the API layer,
which did not account for intercommunicators correctly. I'll try to
fix it.
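
A minimal sketch of the direction such a fix could take (a guess based on
the diagnosis above, not the actual committed patch; it assumes the
OMPI_COMM_IS_INTRA macro from ompi/communicator/communicator.h):

/* Only short-circuit for intracommunicators: on an intercommunicator,
 * one group may legitimately pass sendcount == 0 while the other group
 * still expects data, so the call must reach the collective module
 * instead of returning early. */
if (OMPI_COMM_IS_INTRA(comm) &&
    ((MPI_IN_PLACE != sendbuf && 0 == sendcount) ||
     (0 == recvcount))) {
    return MPI_SUCCESS;
}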

Thanks
edgar

On 05/20/2010 10:48 AM, Battalgazi YILDIRIM wrote:
> Hi,
> 
> you are right, I should have provided C++ and Fortran examples, so I am
> doing so now.
> 
> 
> Here is "cplusplus.cpp"
> 
> #include <mpi.h>
> #include <iostream>
> using namespace std;
> int main()
> {
> MPI::Init();
> char command[] = "./a.out";
> MPI::Info info;
> MPI::Intercomm child = MPI::COMM_WORLD.Spawn(command, NULL, 8, info, 0);
> int a[8]={0,0,0,0,0,0,0,0};
> int dummy;
> child.Allgather(&dummy, 0, MPI::INT, a, 1, MPI::INT);
> child.Disconnect();
> cout << "a[";
> for ( int i = 0; i < 7; i++ )
> cout << a[i] << ",";
> cout << a[7] << "]" << endl;
> 
> MPI::Finalize();
> }
> 
> 
> Here again is "fortran.f90"
> 
> program main
>  use mpi
>  implicit none
>  integer :: parent, rank, val, dummy, ierr
>  call MPI_Init(ierr)
>  call MPI_Comm_get_parent(parent, ierr)
>  call MPI_Comm_rank(parent, rank, ierr)
>  val = rank + 1
>  call MPI_Allgather(val,   1, MPI_INTEGER, &
>                     dummy, 0, MPI_INTEGER, &
>                     parent, ierr)
>  call MPI_Comm_disconnect(parent, ierr)
>  call MPI_Finalize(ierr)
> end program main
> 
> here is how you build and run
> 
> -bash-3.2$ mpif90 fortran.f90
> -bash-3.2$ mpiCC -o parent cplusplus.cpp
> -bash-3.2$ ./parent
> a[0,0,0,0,0,0,0,0]
> 
> 
> 
> If I use mpich2,
> -bash-3.2$ mpif90 fortran.f90
> -bash-3.2$ mpiCC -o parent cplusplus.cpp
> -bash-3.2$ ./parent
> a[1,2,3,4,5,6,7,8]
> 
> I hope that you can reproduce this problem with Open MPI.
> 
> Thanks,
> 
> 
> On Thu, May 20, 2010 at 10:09 AM, Jeff Squyres wrote:
> 
> Can you send us an all-C or all-Fortran example that shows the problem?
> 
> We don't have easy access to test through the python bindings.
>  ...ok, I admit it, it's laziness on my part.  :-)  But having a
> pure Open MPI test app would also remove some possible variables and
> possible sources of error.
> 
> 
> On May 20, 2010, at 9:43 AM, Battalgazi YILDIRIM wrote:
> 
> > Hi Jody,
> >
> > I think that it is correct; you can test this example on your
> > desktop,
> >
> > thanks,
> >
> > On Thu, May 20, 2010 at 3:18 AM, jody wrote:
> > Hi
> > I am really no python expert, but it looks to me as if you were
> > gathering arrays filled with zeroes:
> >  a = array('i', [0]) * n
> >
> > Shouldn't this line be
> >  a = array('i', [r])*n
> > where r is the rank of the process?
> >
> > Jody
> >
> >
> > On Thu, May 20, 2010 at 12:00 AM, Battalgazi YILDIRIM wrote:
> > > Hi,
> > >
> > >
> > > I am trying to use intercommunicator ::Allgather between two
> > > child processes.
> > > I have Fortran and Python code;
> > > I am using mpi4py for Python. It seems that ::Allgather is not
> > > working
> > > properly on my desktop.
> > >
> > > I first contacted the mpi4py developer (Lisandro Dalcin), who
> > > simplified
> > > my problem and provided two example files (python.py and
> > > fortran.f90;
> > > please see below).
> > >
> > > We tried different MPI vendors; the following example
> > > worked correctly
> > > (meaning the final printout should be array('i', [1, 2, 3, 4,
> > > 5, 6, 7, 8])).
> > >
> > > However, it does not give the correct answer on my two desktops
> > > (Red Hat and
> > > Ubuntu), both
> > > using Open MPI.
> > >
> > > Could you look at this problem, please?
> > >
> > > If you want to follow our earlier discussion, you can go to the
> > > following
> > > link:
> > >
> 
> http://groups.google.com/group/mpi4py/browse_thread/thread/c17c660ae56ff97e
> > >
> > > yildirim@memosa:~/python_intercomm$ more python.py
> > > from mpi4py import MPI
> > > from array import array
> > > import os
> > >
> > > progr = os.path.abspath('a.out')
> > > child = MPI.COMM_WORLD.Spawn(progr,[], 8)
> > > n = child.remote_size
> > > a = array('i', [0]) * n
> > > child.Allgather([None, MPI.INT], [a, MPI.INT])
> > > child.Disconnect()
> > > print a
> > >
> > > yildirim@memosa:~/python_intercomm$ more fortran.f90
> > > program main
> > >  use mpi
> > >  implicit none
> > >  integer :: parent, rank, val, dummy, ierr

Re: [OMPI users] Buffer size limit and memory consumption problem on heterogeneous (32 bit / 64 bit) machines

2010-05-20 Thread Olivier Riff
I have done the test with v1.4.2 and indeed it fixes the problem.
Thanks Nysal.
Thank you also, Terry, for your help. With the fix I no longer need to
use a huge value of btl_tcp_eager_limit (I keep the default value), which
considerably decreases the memory consumption I had before. Everything works
fine now.

Regards,

Olivier

2010/5/20 Olivier Riff 

>
>
> 2010/5/20 Nysal Jan 
>
> This probably got fixed in https://svn.open-mpi.org/trac/ompi/ticket/2386
>> Can you try 1.4.2, the fix should be in there.
>>
>>
>
> I will test it soon (it takes some time to install the new version on each
> node). It would be perfect if it fixes it.
> I will tell you the result asap.
>
> Thanks.
>
> Olivier
>
>
>
>
>
>
>
>> Regards
>> --Nysal
>>
>>
>> On Thu, May 20, 2010 at 2:02 PM, Olivier Riff wrote:
>>
>>> Hello,
>>>
>>> I assume this question has already been discussed many times, but I can
>>> not find a solution to my problem on the Internet.
>>> It is about the buffer size limit of MPI_Send and MPI_Recv with a
>>> heterogeneous system (32-bit laptop / 64-bit cluster).
>>> My configuration is:
>>> open mpi 1.4, configured with: --without-openib --enable-heterogeneous
>>> --enable-mpi-threads
>>> The program is launched from a laptop (32-bit Mandriva 2008) which distributes
>>> tasks to a cluster of 70 processors (64-bit Red Hat Enterprise
>>> distribution).
>>> I have to send various buffer sizes from a few bytes up to 30 MB.
>>>
>>> I tested the following commands:
>>> 1) mpirun -v -machinefile machinefile.txt MyMPIProgram
>>> -> crash on the client side (64-bit Red Hat Enterprise) when the sent buffer
>>> size is > 65536.
>>> 2) mpirun --mca btl_tcp_eager_limit 3000 -v -machinefile
>>> machinefile.txt MyMPIProgram
>>> -> works but has the effect of generating gigantic memory consumption on
>>> the 32-bit machine side after MPI_Recv. Memory consumption goes from 800 MB to
>>> 2.1 GB after receiving about 20 KB from each of the 70 clients (a total of about 1.4
>>> MB).  This makes my program crash later because I have no more memory to
>>> allocate new structures. I read in an Open MPI forum thread that setting
>>> btl_tcp_eager_limit to a huge value explains this huge memory consumption
>>> when a sent message does not have a preposted ready recv. Also, after all
>>> messages have been received and there is no more traffic activity, the
>>> memory consumed remains at 2.1 GB... and I do not understand why.
>>>
>>> What is the best way to get a working program which also
>>> has small memory consumption (the speed performance can be lower)?
>>> I tried to play with the MCA parameters btl_tcp_sndbuf and btl_tcp_rcvbuf,
>>> but without success.
>>>
>>> Thanks in advance for your help.
>>>
>>> Best regards,
>>>
>>> Olivier
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>


Re: [OMPI users] General question on the implementation of a"scheduler" on client side...

2010-05-20 Thread Jeff Squyres
You're basically talking about implementing some kind of application-specific 
protocol.  A few tips that may help in your design:

1. Look into MPI_Isend / MPI_Irecv for non-blocking sends and receives.  These 
may be particularly useful on the server side, so that it can do other stuff 
while sends and receives are progressing.

2. You probably already noticed that collective operations (broadcasts and the 
like) need to be invoked by all members of the communicator.  So if you want to 
do a broadcast, everyone needs to know.  That being said, you can send a short 
message to everyone alerting them that a longer broadcast is coming -- then 
they can execute MPI_BCAST, etc. (see the sketch after these tips).  That works 
best if your broadcasts are large messages (i.e., you benefit from scalable 
implementations of broadcast) -- otherwise you're individually sending short 
messages followed by a short broadcast.  There might not be much of a "win" there.

3. FWIW, the MPI Forum has introduced the concept of non-blocking collective 
operations for the upcoming MPI-3 spec.  These may help; google for libnbc for 
a (non-optimized) implementation that may be of help to you.  MPI 
implementations (like Open MPI) will feature non-blocking collectives someday 
in the future.
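
As a concrete illustration of the control-message idea in tip 2, here is a 
minimal sketch of a worker loop in C (the opcode values and the processing 
step are invented for illustration; this is not code from any real 
application):

#include <mpi.h>

enum { OP_TASK = 1, OP_BCAST = 2, OP_STOP = 3 };

static void worker_loop(MPI_Comm comm)
{
    int op, len;
    static char buf[1 << 20];   /* 1 MB scratch buffer for this sketch */
    MPI_Status st;

    for (;;) {
        /* Short control message from rank 0: which operation comes next? */
        MPI_Recv(&op, 1, MPI_INT, 0, 0, comm, MPI_STATUS_IGNORE);
        if (OP_STOP == op)
            break;
        if (OP_TASK == op) {
            /* Point-to-point task: receive a buffer, process it, reply. */
            MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 1, comm, &st);
            MPI_Get_count(&st, MPI_CHAR, &len);
            /* ... process buf[0..len) here ... */
            MPI_Send(buf, len, MPI_CHAR, 0, 2, comm);
        } else if (OP_BCAST == op) {
            /* Everyone was alerted first, so the collective calls match. */
            MPI_Bcast(buf, sizeof(buf), MPI_CHAR, 0, comm);
        }
    }
}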


On May 20, 2010, at 5:30 AM, Olivier Riff wrote:

> Hello,
> 
> I have a general question about the best way to implement an Open MPI 
> application, i.e. the design of the application.
> 
> A machine (I call it the "server") should regularly send tasks to do (byte 
> buffers of widely varying size) to a cluster containing a lot of processors 
> (the "clients").
> The server should send each client a different buffer, then wait for each 
> client's answer (a buffer sent back by each client after some processing), and 
> retrieve the result data.
> 
> First I made something looking like this.
> On the server side: send buffers sequentially to each client using MPI_Send.
> On each client side: a loop which waits for a buffer using MPI_Recv, then 
> processes the buffer and sends the result using MPI_Send.
> This is really not efficient because a lot of time is lost due to the fact 
> that the server sends and receives the buffers sequentially.
> It only has the advantage of a pretty easy scheduler on the client side: 
> Wait for buffer (MPI_Recv) -> Analyse it -> Send result (MPI_Send)
> 
> My wish is to mix MPI_Send/MPI_Recv and other MPI functions like 
> MPI_BCast/MPI_Scatter/MPI_Gather... (like I imagine every MPI application 
> does).
> The problem is that I cannot find an easy solution so that each client 
> knows which kind of MPI function is currently called by the server. If the 
> server calls MPI_BCast the client should do the same. Sending each time a 
> first message to indicate the function the server will call next does not 
> look very nice, so I do not see an easy/best way to implement an 
> "adaptive" scheduler on the client side.
> 
> Any tip, advice, or help would be appreciated.
> 
> 
> Thanks,
> 
> Olivier
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread Battalgazi YILDIRIM
Hi,

you are right, I should have provided C++ and Fortran examples, so I am doing
so now.


Here is "cplusplus.cpp"

#include <mpi.h>
#include <iostream>
using namespace std;
int main()
{
MPI::Init();
char command[] = "./a.out";
MPI::Info info;
MPI::Intercomm child = MPI::COMM_WORLD.Spawn(command, NULL, 8, info, 0);
int a[8]={0,0,0,0,0,0,0,0};
int dummy;
child.Allgather(&dummy, 0, MPI::INT, a, 1, MPI::INT);
child.Disconnect();
cout << "a[";
for ( int i = 0; i < 7; i++ )
cout << a[i] << ",";
cout << a[7] << "]" << endl;

MPI::Finalize();
}


Here again is "fortran.f90"

program main
 use mpi
 implicit none
 integer :: parent, rank, val, dummy, ierr
 call MPI_Init(ierr)
 call MPI_Comm_get_parent(parent, ierr)
 call MPI_Comm_rank(parent, rank, ierr)
 val = rank + 1
 call MPI_Allgather(val,   1, MPI_INTEGER, &
                    dummy, 0, MPI_INTEGER, &
                    parent, ierr)
 call MPI_Comm_disconnect(parent, ierr)
 call MPI_Finalize(ierr)
end program main

here is how you build and run

-bash-3.2$ mpif90 fortran.f90
-bash-3.2$ mpiCC -o parent cplusplus.cpp
-bash-3.2$ ./parent
a[0,0,0,0,0,0,0,0]



If I use mpich2,
-bash-3.2$ mpif90 fortran.f90
-bash-3.2$ mpiCC -o parent cplusplus.cpp
-bash-3.2$ ./parent
a[1,2,3,4,5,6,7,8]

I hope that you can reproduce this problem with Open MPI.

Thanks,


On Thu, May 20, 2010 at 10:09 AM, Jeff Squyres  wrote:

> Can you send us an all-C or all-Fortran example that shows the problem?
>
> We don't have easy access to test through the python bindings.  ...ok, I
> admit it, it's laziness on my part.  :-)  But having a pure Open MPI test
> app would also remove some possible variables and possible sources of error.
>
>
> On May 20, 2010, at 9:43 AM, Battalgazi YILDIRIM wrote:
>
> > Hi Jody,
> >
> > I think that it is correct; you can test this example on your desktop,
> >
> > thanks,
> >
> > On Thu, May 20, 2010 at 3:18 AM, jody  wrote:
> > Hi
> > I am really no python expert, but it looks to me as if you were
> > gathering arrays filled with zeroes:
> >  a = array('i', [0]) * n
> >
> > Shouldn't this line be
> >  a = array('i', [r])*n
> > where r is the rank of the process?
> >
> > Jody
> >
> >
> > On Thu, May 20, 2010 at 12:00 AM, Battalgazi YILDIRIM
> >  wrote:
> > > Hi,
> > >
> > >
> > > I am trying to use intercommunicator ::Allgather between two child
> > > processes.
> > > I have Fortran and Python code;
> > > I am using mpi4py for Python. It seems that ::Allgather is not working
> > > properly on my desktop.
> > >
> > > I first contacted the mpi4py developer (Lisandro Dalcin), who
> > > simplified
> > > my problem and provided two example files (python.py and fortran.f90;
> > > please see below).
> > >
> > > We tried different MPI vendors; the following example worked
> > > correctly
> > > (meaning the final printout should be array('i', [1, 2, 3, 4, 5, 6, 7,
> > > 8])).
> > >
> > > However, it does not give the correct answer on my two desktops (Red Hat and
> > > Ubuntu), both
> > > using Open MPI.
> > >
> > > Could you look at this problem, please?
> > >
> > > If you want to follow our earlier discussion, you can go to the
> > > following
> > > link:
> > >
> http://groups.google.com/group/mpi4py/browse_thread/thread/c17c660ae56ff97e
> > >
> > > yildirim@memosa:~/python_intercomm$ more python.py
> > > from mpi4py import MPI
> > > from array import array
> > > import os
> > >
> > > progr = os.path.abspath('a.out')
> > > child = MPI.COMM_WORLD.Spawn(progr,[], 8)
> > > n = child.remote_size
> > > a = array('i', [0]) * n
> > > child.Allgather([None,MPI.INT],[a,MPI.INT])
> > > child.Disconnect()
> > > print a
> > >
> > > yildirim@memosa:~/python_intercomm$ more fortran.f90
> > > program main
> > >  use mpi
> > >  implicit none
> > >  integer :: parent, rank, val, dummy, ierr
> > >  call MPI_Init(ierr)
> > >  call MPI_Comm_get_parent(parent, ierr)
> > >  call MPI_Comm_rank(parent, rank, ierr)
> > >  val = rank + 1
> > >  call MPI_Allgather(val,   1, MPI_INTEGER, &
> > >                     dummy, 0, MPI_INTEGER, &
> > >                     parent, ierr)
> > >  call MPI_Comm_disconnect(parent, ierr)
> > >  call MPI_Finalize(ierr)
> > > end program main
> > >
> > > yildirim@memosa:~/python_intercomm$ mpif90 fortran.f90
> > >
> > > yildirim@memosa:~/python_intercomm$ python python.py
> > > array('i', [0, 0, 0, 0, 0, 0, 0, 0])
> > >
> > >
> > > --
> > > B. Gazi YILDIRIM
> > >
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> >
> >
> > --
> > B. Gazi YILDIRIM
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>

Re: [OMPI users] Segmentation fault at program end with 2+ processes

2010-05-20 Thread Prentice Bisbal
I hope I'm not too late in my reply, and I hope I'm not repeating the
same solution others have given you.

I had a similar error in a code a few months ago. The error was this: I
think I was doing an MPI_Pack/Unpack to send data between nodes. The
problem was that I was allocating space for a buffer using the wrong
variable, so there was a buffer size mismatch between the sending and
receiving nodes.

When running the program as a single instance, these buffers weren't really
being used, so the problem never presented itself. Trickier still, the
problem only occurred when the payload exceeded a certain size (number
of elements in the array, or data in the packed buffer) when run in parallel.

I used valgrind, which didn't shed much light on the problem. I finally
found my error when tracking down the data size dependency.
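
In case it is useful, here is a minimal sketch of the pattern that avoids
that class of mismatch: both sides derive the buffer size from the same
element count via MPI_Pack_size instead of allocating from an unrelated
variable (the names here are illustrative, not from my actual code):

#include <mpi.h>
#include <stdlib.h>

static void send_packed(double *data, int nelems, int dest, MPI_Comm comm)
{
    int size = 0, pos = 0;
    /* Upper bound on the packed size of nelems doubles. */
    MPI_Pack_size(nelems, MPI_DOUBLE, comm, &size);
    char *buf = malloc(size);
    MPI_Pack(data, nelems, MPI_DOUBLE, buf, size, &pos, comm);
    /* Send exactly the packed bytes; the receiver sizes its buffer with
     * the same MPI_Pack_size call on the same element count. */
    MPI_Send(buf, pos, MPI_PACKED, dest, 0, comm);
    free(buf);
}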

I hope that helps.

Prentice


Jeff Squyres wrote:
> Ouch.  These are the worst kinds of bugs to find.  :-(
> 
> If you attach a debugger to these processes and step through the final death 
> throes of the process, does it provide any additional insight?  I have not 
> infrequently done stuff like this:
> 
>    {
>        int i = 0;
>        printf("Process %d ready to attach\n", getpid());
>        while (i == 0) sleep(5);
>    }
> 
> Then you get a message indicating which pid to attach to.  When you attach, 
> set the variable i to nonzero and you can continue stepping through the 
> process.
> 
> 
> 
> On May 14, 2010, at 10:44 AM, Paul-Michael Agapow wrote:
> 
>> Apologies for the vague details of the problem I'm about to describe,
>> but then I only understand it vaguely. Any pointers about the best
>> directions for further investigation would be appreciated. Lengthy
>> details follow:
>>
>> So I'm "MPI-izing" a pre-existing C++ program (not mine) and have run
>> into some weird behaviour. When run under mpiexec, a segmentation
>> fault is thrown:
>>
>> % mpiexec -n 2 ./omegamip
>> [...]
>> main.cpp:52: Finished.
>> Completed 20 of 20 in 0.0695 minutes
>> [queen:23560] *** Process received signal ***
>> [queen:23560] Signal: Segmentation fault (11)
>> [queen:23560] Signal code:  (128)
>> [queen:23560] Failing at address: (nil)
>> [queen:23560] [ 0] /lib64/libpthread.so.0 [0x3d6a00de80]
>> [queen:23560] [ 1] /opt/openmpi/lib/libopen-pal.so.0(_int_free+0x40)
>> [0x2afb1fa43460]
>> [queen:23560] [ 2] /opt/openmpi/lib/libopen-pal.so.0(free+0xbd) 
>> [0x2afb1fa439ad]
>> [queen:23560] [ 3] ./omegamip(_ZN12omegaMapBaseD2Ev+0x5b) [0x433c2b]
>> [queen:23560] [ 4] ./omegamip(main+0x18c) [0x415ccc]
>> [queen:23560] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4) [0x3d6941d8b4]
>> [queen:23560] [ 6] ./omegamip(__gxx_personality_v0+0x1e9) [0x40ee59]
>> [queen:23560] *** End of error message ***
>> mpiexec noticed that job rank 1 with PID 23560 on node
>> queen.bioinformatics exited on signal 11 (Segmentation fault).
>>
>> Right, so I've got a memory overrun or something. Except that when the
>> program is run in standalone mode, it works fine:
>>
>> % ./omegamip
>> [...]
>> main.cpp:52: Finished.
>> Completed 20 of 20 in 0.05970 minutes
>>
>> Right, so there's a difference between my standalone and MPI modes.
>> Except that the difference between my standalone and MPI versions is
>> currently nothing but the calls to MPI_Init, MPI_Finalize and some
>> exploratory calls to MPI_Comm_size and MPI_Comm_rank. (I haven't
>> gotten as far as coding the problem division.) Also, calling mpiexec
>> with 1 process always works:
>>
>> % mpiexec -n 1 ./omegamip
>> [...]
>> main.cpp:52: Finished.
>> Completed 20 of 20 in 0.05801 minutes
>>
>> So there's still this segmentation fault. Running valgrind across the
>> program doesn't show any obvious problems: there was some quirky
>> pointer arithmetic and some huge blocks of dangling memory, but these
>> were only leaked at the end of the program (i.e. the original
>> programmer didn't bother cleaning up at program termination). I've
>> caught most of those. But the segmentation fault still occurs only
>> when run under mpiexec with 2 or more processes. And by use of
>> diagnostic printfs and logging, I can see that it only occurs at the
>> very end of the program, the very end of main, possibly when
>> destructors are being automatically called. But again this cleanup
>> doesn't cause any problems with the standalone or 1 process modes.
>>
>> So, any ideas for where to start looking?
>>
>> technical details: gcc v4.1.2, C++, mpiexec (OpenRTE) 1.2.7, x86_64,
>> Red Hat 4.1.2-42
>>
>> 
>> Paul-Michael Agapow (paul-michael.agapow (at) hpa.org.uk)
>> Bioinformatics, Centre for Infections, Health Protection Agency
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
> 
> 

-- 
Prentice Bisbal
Linux Software Support Specialist/System Administrator
School of Natural Sciences
Institute for Advanced Study
Princeton, NJ


Re: [OMPI users] GM + OpenMPI bug ...

2010-05-20 Thread Patrick Geoffray

Hi Jose,

On 5/12/2010 10:57 PM, José Ignacio Aliaga Estellés wrote:

I think that I have found a bug in the implementation of the GM collective
routines included in Open MPI. The version of the GM software is 2.0.30
for the PCI64 cards.



I obtain the same problems when I use the 1.4.1 or the 1.4.2 version.
Could you help me? Thanks.


We have been running the test you provided on 8 nodes for 4 hours and 
haven't seen any errors. The setup used GM 2.0.30 and openmpi 1.4.2 on 
PCI-X cards (M3F-PCIXD-2 aka 'D' cards). We do not have PCI64 NICs 
anymore, and no machines with a PCI 64/66 slot.


One-bit errors are rarely a software problem, they are usually linked to 
hardware corruption. Old PCI has a simple parity check but most 
machines/BIOS of this era ignored reported errors. You may want to check 
the lspci output on your machines and see if SERR or PERR is set. You 
can also try to reset each NIC in its PCI slot, or use a different slot 
if available.
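
For example, something along these lines (a rough illustration; the exact 
status flags shown in the lspci output vary by device and kernel version):

$ lspci -vv | grep -e SERR -e PERR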


Hope it helps.

Patrick
--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com


Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread Jeff Squyres
Can you send us an all-C or all-Fortran example that shows the problem?

We don't have easy access to test through the python bindings.  ...ok, I admit 
it, it's laziness on my part.  :-)  But having a pure Open MPI test app would 
also remove some possible variables and possible sources of error.
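
For reference, below is a sketch of the kind of all-C reproducer meant here 
(illustrative and untested; it mirrors the C++/Fortran pair posted elsewhere 
in this thread): one binary that spawns 8 copies of itself, where each child 
contributes one int to the intercommunicator Allgather and the parent 
contributes nothing and should receive 1..8.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int a[8] = {0}, dummy = 0, i, rank;
    MPI_Comm parent, child;

    MPI_Init(&argc, &argv);
    MPI_Comm_get_parent(&parent);
    if (MPI_COMM_NULL == parent) {
        /* Parent role: spawn 8 children and gather one int from each. */
        MPI_Comm_spawn(argv[0], MPI_ARGV_NULL, 8, MPI_INFO_NULL, 0,
                       MPI_COMM_SELF, &child, MPI_ERRCODES_IGNORE);
        MPI_Allgather(&dummy, 0, MPI_INT, a, 1, MPI_INT, child);
        MPI_Comm_disconnect(&child);
        for (i = 0; i < 8; ++i)
            printf("%d ", a[i]);    /* expect: 1 2 3 4 5 6 7 8 */
        printf("\n");
    } else {
        /* Child role: send rank+1, receive nothing from the parent group. */
        MPI_Comm_rank(parent, &rank);
        rank += 1;
        MPI_Allgather(&rank, 1, MPI_INT, &dummy, 0, MPI_INT, parent);
        MPI_Comm_disconnect(&parent);
    }
    MPI_Finalize();
    return 0;
}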


On May 20, 2010, at 9:43 AM, Battalgazi YILDIRIM wrote:

> Hi Jody,
> 
> I think that it is correct; you can test this example on your desktop,
> 
> thanks,
> 
> On Thu, May 20, 2010 at 3:18 AM, jody  wrote:
> Hi
> I am really no python expert, but it looks to me as if you were
> gathering arrays filled with zeroes:
>  a = array('i', [0]) * n
> 
> Shouldn't this line be
>  a = array('i', [r])*n
> where r is the rank of the process?
> 
> Jody
> 
> 
> On Thu, May 20, 2010 at 12:00 AM, Battalgazi YILDIRIM
>  wrote:
> > Hi,
> >
> >
> > I am trying to use intercommunicator ::Allgather between two child processes.
> > I have Fortran and Python code;
> > I am using mpi4py for Python. It seems that ::Allgather is not working
> > properly on my desktop.
> >
> > I first contacted the mpi4py developer (Lisandro Dalcin), who simplified
> > my problem and provided two example files (python.py and fortran.f90;
> > please see below).
> >
> > We tried different MPI vendors; the following example worked correctly
> > (meaning the final printout should be array('i', [1, 2, 3, 4, 5, 6, 7, 8])).
> >
> > However, it does not give the correct answer on my two desktops (Red Hat and
> > Ubuntu), both
> > using Open MPI.
> >
> > Could you look at this problem, please?
> >
> > If you want to follow our earlier discussion, you can go to the following
> > link:
> > http://groups.google.com/group/mpi4py/browse_thread/thread/c17c660ae56ff97e
> >
> > yildirim@memosa:~/python_intercomm$ more python.py
> > from mpi4py import MPI
> > from array import array
> > import os
> >
> > progr = os.path.abspath('a.out')
> > child = MPI.COMM_WORLD.Spawn(progr,[], 8)
> > n = child.remote_size
> > a = array('i', [0]) * n
> > child.Allgather([None,MPI.INT],[a,MPI.INT])
> > child.Disconnect()
> > print a
> >
> > yildirim@memosa:~/python_intercomm$ more fortran.f90
> > program main
> >  use mpi
> >  implicit none
> >  integer :: parent, rank, val, dummy, ierr
> >  call MPI_Init(ierr)
> >  call MPI_Comm_get_parent(parent, ierr)
> >  call MPI_Comm_rank(parent, rank, ierr)
> >  val = rank + 1
> >  call MPI_Allgather(val,   1, MPI_INTEGER, &
> >                     dummy, 0, MPI_INTEGER, &
> >                     parent, ierr)
> >  call MPI_Comm_disconnect(parent, ierr)
> >  call MPI_Finalize(ierr)
> > end program main
> >
> > yildirim@memosa:~/python_intercomm$ mpif90 fortran.f90
> >
> > yildirim@memosa:~/python_intercomm$ python python.py
> > array('i', [0, 0, 0, 0, 0, 0, 0, 0])
> >
> >
> > --
> > B. Gazi YILDIRIM
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> -- 
> B. Gazi YILDIRIM
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread Battalgazi YILDIRIM
Hi Jody,

I think that it is correct; you can test this example on your desktop.

thanks,

On Thu, May 20, 2010 at 3:18 AM, jody  wrote:

> Hi
> I am really no python expert, but it looks to me as if you were
> gathering arrays filled with zeroes:
>   a = array('i', [0]) * n
>
> Shouldn't this line be
>  a = array('i', [r])*n
> where r is the rank of the process?
>
> Jody
>
>
> On Thu, May 20, 2010 at 12:00 AM, Battalgazi YILDIRIM
>  wrote:
> > Hi,
> >
> >
> > I am trying to use intercommunicator ::Allgather between two child
> > processes.
> > I have Fortran and Python code;
> > I am using mpi4py for Python. It seems that ::Allgather is not working
> > properly on my desktop.
> >
> > I first contacted the mpi4py developer (Lisandro Dalcin), who
> > simplified
> > my problem and provided two example files (python.py and fortran.f90;
> > please see below).
> >
> > We tried different MPI vendors; the following example worked
> > correctly
> > (meaning the final printout should be array('i', [1, 2, 3, 4, 5, 6, 7,
> > 8])).
> >
> > However, it does not give the correct answer on my two desktops (Red Hat and
> > Ubuntu), both
> > using Open MPI.
> >
> > Could you look at this problem, please?
> >
> > If you want to follow our earlier discussion, you can go to the following
> > link:
> >
> http://groups.google.com/group/mpi4py/browse_thread/thread/c17c660ae56ff97e
> >
> > yildirim@memosa:~/python_intercomm$ more python.py
> > from mpi4py import MPI
> > from array import array
> > import os
> >
> > progr = os.path.abspath('a.out')
> > child = MPI.COMM_WORLD.Spawn(progr,[], 8)
> > n = child.remote_size
> > a = array('i', [0]) * n
> > child.Allgather([None,MPI.INT],[a,MPI.INT])
> > child.Disconnect()
> > print a
> >
> > yildirim@memosa:~/python_intercomm$ more fortran.f90
> > program main
> >  use mpi
> >  implicit none
> >  integer :: parent, rank, val, dummy, ierr
> >  call MPI_Init(ierr)
> >  call MPI_Comm_get_parent(parent, ierr)
> >  call MPI_Comm_rank(parent, rank, ierr)
> >  val = rank + 1
> >  call MPI_Allgather(val,   1, MPI_INTEGER, &
> >                     dummy, 0, MPI_INTEGER, &
> >                     parent, ierr)
> >  call MPI_Comm_disconnect(parent, ierr)
> >  call MPI_Finalize(ierr)
> > end program main
> >
> > yildirim@memosa:~/python_intercomm$ mpif90 fortran.f90
> >
> > yildirim@memosa:~/python_intercomm$ python python.py
> > array('i', [0, 0, 0, 0, 0, 0, 0, 0])
> >
> >
> > --
> > B. Gazi YILDIRIM
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
B. Gazi YILDIRIM


Re: [OMPI users] How to show outputs from MPI program that runs on a cluster?

2010-05-20 Thread Sang Chul Choi
Thank you!

Sang Chul

On May 20, 2010, at 2:39 AM, jody wrote:

> Hi
> mpirun has an option for this (check the mpirun man page):
> 
>   -tag-output, --tag-output
>  Tag  each  line  of  output to stdout, stderr, and
> stddiag with [jobid, rank] indicating the process jobid and
> rank that generated the
>  output, and the channel which generated it.
> 
> Using this you can filter the entire output by grepping for the required rank.
> 
> Another possibility is to use the option
>   -xterm, --xterm 
>  Display the specified ranks in separate xterm windows.
> The ranks are specified as a comma-separated list of ranges, with a -1
> indicating  all.
>  A separate window will be created for each specified
> rank.  Note: In some environments, xterm may require that the
> executable be in the user’s
>  path, or be specified in absolute or relative terms.
> Thus, it may be necessary to specify a local executable as "./foo"
> instead of just "foo".
>  If xterm fails to find the executable, mpirun will hang,
> but still respond correctly to a ctrl-c.  If this happens, please
> check that the exe-
>  cutable is being specified correctly and try again.
> 
> That way you can open a single terminal window for the process you are
> interested in.
> 
> 
> Jody
> 
> 
> On Thu, May 20, 2010 at 1:28 AM, Sang Chul Choi  wrote:
>> Hi,
>> 
>> I am wondering if there is a way to run a particular process among multiple 
>> processes on the console of a linux cluster.
>> 
>> I want to see the screen output (standard output) of a particular process 
>> (using a particular ID of a process) on the console screen while the MPI 
>> program is running.  I think that if I run a MPI program on a linux cluster 
>> using Sun Grid Engine, the particular process that prints out to standard 
>> output could run on the console or computing node.   And, it would be hard 
>> to see screen output of the particular process.  Is there a way to to set 
>> one process aside and to run it on the console in Sun Grid Engine?
>> 
>> When I run the MPI program on my desktop with quad cores, I can set aside 
>> one process using an ID to print information that I need.  I do not know how 
>> I could do that in much larger scale like using Sun Grid Engine.  I could 
>> let one process print out in a file and then I could see it.  I do not know 
>> how I could let one process to print out on the console screen by setting it 
>> to run on the console using Sun Grid Engine or any other similar thing such 
>> as PBS.  I doubt that a cluster would allow jobs to run on the console 
>> because then others users would have to be in trouble in submitting jobs.  
>> If this is the case, there seem no way to print out on the console.   Then, 
>> do I have to have a separate (non-MPI) program that can communicate with MPI 
>> program using TCP/IP by running the separate program on the master node of a 
>> cluster?  This separate non-MPI program may then communicate sporadically 
>> with the MPI program.  I do not know if this is a general approach or a 
>> peculiar way.
>> 
>> I will appreciate any of input.
>> 
>> Thank you,
>> 
>> Sang Chul
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Rép : openmpi + share points

2010-05-20 Thread Jeff Squyres
I replied to this yesterday:

http://www.open-mpi.org/community/lists/users/2010/05/13090.php



On May 20, 2010, at 8:13 AM, Christophe Peyret wrote:

> Hello,
> 
> Thanks for the advice, it works with NFS!
> 
> But :
> 
> 1) it doesn't work anymore if I remove --prefix /Network/opt/openmpi-1.4.2 
> (is there a way to remove it on OS X if it is already declared?)
> 
> 2) I must use the option -static-intel at link time, else I have a problem with 
> libiomp5.dylib not being found
> 
> Christophe
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] Rép : openmpi + share points

2010-05-20 Thread Christophe Peyret
Hello,

Thanks for the advice, it works with NFS!

But :

1) it doesn't work anymore if I remove --prefix /Network/opt/openmpi-1.4.2 (is 
there a way to remove it on OS X if it is already declared?)

2) I must use the option -static-intel at link time, else I have a problem with 
libiomp5.dylib not being found

Christophe




Re: [OMPI users] Buffer size limit and memory consumption problem on heterogeneous (32 bit / 64 bit) machines

2010-05-20 Thread Olivier Riff
2010/5/20 Nysal Jan 

> This probably got fixed in https://svn.open-mpi.org/trac/ompi/ticket/2386
> Can you try 1.4.2, the fix should be in there.
>
>

I will test it soon (it takes some time to install the new version on each
node). It would be perfect if it fixes it.
I will tell you the result asap.

Thanks.

Olivier







> Regards
> --Nysal
>
>
> On Thu, May 20, 2010 at 2:02 PM, Olivier Riff wrote:
>
>> Hello,
>>
>> I assume this question has already been discussed many times, but I can
>> not find a solution to my problem on the Internet.
>> It is about the buffer size limit of MPI_Send and MPI_Recv with a
>> heterogeneous system (32-bit laptop / 64-bit cluster).
>> My configuration is:
>> open mpi 1.4, configured with: --without-openib --enable-heterogeneous
>> --enable-mpi-threads
>> The program is launched from a laptop (32-bit Mandriva 2008) which distributes
>> tasks to a cluster of 70 processors (64-bit Red Hat Enterprise
>> distribution).
>> I have to send various buffer sizes from a few bytes up to 30 MB.
>>
>> I tested the following commands:
>> 1) mpirun -v -machinefile machinefile.txt MyMPIProgram
>> -> crash on the client side (64-bit Red Hat Enterprise) when the sent buffer size
>> is > 65536.
>> 2) mpirun --mca btl_tcp_eager_limit 3000 -v -machinefile
>> machinefile.txt MyMPIProgram
>> -> works but has the effect of generating gigantic memory consumption on
>> the 32-bit machine side after MPI_Recv. Memory consumption goes from 800 MB to
>> 2.1 GB after receiving about 20 KB from each of the 70 clients (a total of about 1.4
>> MB).  This makes my program crash later because I have no more memory to
>> allocate new structures. I read in an Open MPI forum thread that setting
>> btl_tcp_eager_limit to a huge value explains this huge memory consumption
>> when a sent message does not have a preposted ready recv. Also, after all
>> messages have been received and there is no more traffic activity, the
>> memory consumed remains at 2.1 GB... and I do not understand why.
>>
>> What is the best way to get a working program which also
>> has small memory consumption (the speed performance can be lower)?
>> I tried to play with the MCA parameters btl_tcp_sndbuf and btl_tcp_rcvbuf,
>> but without success.
>>
>> Thanks in advance for your help.
>>
>> Best regards,
>>
>> Olivier
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Buffer size limit and memory consumption problem on heterogeneous (32 bit / 64 bit) machines

2010-05-20 Thread Olivier Riff
Hello Terry,

Thanks for your answer.

2010/5/20 Terry Dontje 

>  Olivier Riff wrote:
>
> Hello,
>
> I assume this question has already been discussed many times, but I can not
> find a solution to my problem on the Internet.
> It is about the buffer size limit of MPI_Send and MPI_Recv with a heterogeneous
> system (32-bit laptop / 64-bit cluster).
> My configuration is:
> open mpi 1.4, configured with: --without-openib --enable-heterogeneous
> --enable-mpi-threads
> The program is launched from a laptop (32-bit Mandriva 2008) which distributes tasks
> to a cluster of 70 processors (64-bit Red Hat Enterprise
> distribution).
> I have to send various buffer sizes from a few bytes up to 30 MB.
>
>  You really want to get your program running without the tcp_eager_limit
> set if you want a better usage of memory.  I believe the crash has something
> to do with the rendezvous protocol in OMPI.  Have you narrowed this failure
> down to a simple MPI program?  Also I noticed that you're configuring with
> --enable-mpi-threads, have you tried configuring without that option?
>
>
-> No, unfortunately I have not narrowed this behaviour down to a simple MPI
program yet. I think I will have to do it if I do not find a solution in the
next few days.
I will also make the test without the --enable-mpi-threads configuration.


> I tested the following commands:
> 1) mpirun -v -machinefile machinefile.txt MyMPIProgram
> -> crash on the client side (64-bit Red Hat Enterprise) when the sent buffer size
> is > 65536.
> 2) mpirun --mca btl_tcp_eager_limit 3000 -v -machinefile
> machinefile.txt MyMPIProgram
> -> works but has the effect of generating gigantic memory consumption on the
> 32-bit machine side after MPI_Recv. Memory consumption goes from 800 MB to 2.1 GB
> after receiving about 20 KB from each of the 70 clients (a total of about 1.4
> MB).  This makes my program crash later because I have no more memory to
> allocate new structures. I read in an Open MPI forum thread that setting
> btl_tcp_eager_limit to a huge value explains this huge memory consumption
> when a sent message does not have a preposted ready recv. Also, after all
> messages have been received and there is no more traffic activity, the
> memory consumed remains at 2.1 GB... and I do not understand why.
>
> Are the 70 clients all on different nodes?  I am curious if the 2.1GB is
> due to the SM BTL or possibly a leak in the TCP BTL.
>

No, the 70 clients are only on 9 nodes. In fact it is 72 clients: they are nine
8-processor machines.
The 2.1 GB memory consumption appears when I sequentially try to read the
result from each of the 72 clients (a for loop from 1 to 72 calling MPI_Recv). I assume
that many clients have already sent their result whereas the server has not
called MPI_Recv for the corresponding rank yet.
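
A sketch of the preposted-receive pattern discussed here (assumptions: the
72 clients sit on ranks 1..72, and MAXRES is an invented upper bound on a
result's size): posting all the receives before the results arrive lets
incoming messages land in user buffers instead of accumulating as
unexpected messages.

#include <mpi.h>

#define NCLIENTS 72
#define MAXRES   (1 << 20)   /* illustrative per-result upper bound */

static void collect_results(char (*res)[MAXRES], MPI_Comm comm)
{
    MPI_Request req[NCLIENTS];
    int i;
    /* Post every receive up front... */
    for (i = 0; i < NCLIENTS; ++i)
        MPI_Irecv(res[i], MAXRES, MPI_CHAR, i + 1, 0, comm, &req[i]);
    /* ...then wait for all of them in whatever order they complete. */
    MPI_Waitall(NCLIENTS, req, MPI_STATUSES_IGNORE);
}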


>
> What is the best way to get a working program which also
> has small memory consumption (the speed performance can be lower)?
> I tried to play with the MCA parameters btl_tcp_sndbuf and btl_tcp_rcvbuf,
> but without success.
>
> Thanks in advance for your help.
>
> Best regards,
>
> Olivier
>
> --
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.650.633.7054
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Buffer size limit and memory consumption problem on heterogeneous (32 bit / 64 bit) machines

2010-05-20 Thread Nysal Jan
This probably got fixed in https://svn.open-mpi.org/trac/ompi/ticket/2386
Can you try 1.4.2, the fix should be in there.

Regards
--Nysal


On Thu, May 20, 2010 at 2:02 PM, Olivier Riff wrote:

> Hello,
>
> I assume this question has already been discussed many times, but I can not
> find a solution to my problem on the Internet.
> It is about the buffer size limit of MPI_Send and MPI_Recv with a heterogeneous
> system (32-bit laptop / 64-bit cluster).
> My configuration is:
> open mpi 1.4, configured with: --without-openib --enable-heterogeneous
> --enable-mpi-threads
> The program is launched from a laptop (32-bit Mandriva 2008) which distributes tasks
> to a cluster of 70 processors (64-bit Red Hat Enterprise
> distribution).
> I have to send various buffer sizes from a few bytes up to 30 MB.
>
> I tested the following commands:
> 1) mpirun -v -machinefile machinefile.txt MyMPIProgram
> -> crash on the client side (64-bit Red Hat Enterprise) when the sent buffer size
> is > 65536.
> 2) mpirun --mca btl_tcp_eager_limit 3000 -v -machinefile
> machinefile.txt MyMPIProgram
> -> works but has the effect of generating gigantic memory consumption on the
> 32-bit machine side after MPI_Recv. Memory consumption goes from 800 MB to 2.1 GB
> after receiving about 20 KB from each of the 70 clients (a total of about 1.4
> MB).  This makes my program crash later because I have no more memory to
> allocate new structures. I read in an Open MPI forum thread that setting
> btl_tcp_eager_limit to a huge value explains this huge memory consumption
> when a sent message does not have a preposted ready recv. Also, after all
> messages have been received and there is no more traffic activity, the
> memory consumed remains at 2.1 GB... and I do not understand why.
>
> What is the best way to get a working program which also
> has small memory consumption (the speed performance can be lower)?
> I tried to play with the MCA parameters btl_tcp_sndbuf and btl_tcp_rcvbuf,
> but without success.
>
> Thanks in advance for your help.
>
> Best regards,
>
> Olivier
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] Buffer size limit and memory consumption problem on heterogeneous (32 bit / 64 bit) machines

2010-05-20 Thread Terry Dontje

Olivier Riff wrote:

Hello,

I assume this question has already been discussed many times, but I 
can not find a solution to my problem on the Internet.
It is about the buffer size limit of MPI_Send and MPI_Recv with a 
heterogeneous system (32-bit laptop / 64-bit cluster).

My configuration is:
open mpi 1.4, configured with: --without-openib --enable-heterogeneous 
--enable-mpi-threads
The program is launched from a laptop (32-bit Mandriva 2008) which 
distributes tasks to a cluster of 70 processors (64-bit Red Hat Enterprise 
distribution).

I have to send various buffer sizes from a few bytes up to 30 MB.

You really want to get your program running without the tcp_eager_limit 
set if you want a better usage of memory.  I believe the crash has 
something to do with the rendezvous protocol in OMPI.  Have you narrowed 
this failure down to a simple MPI program?  Also I noticed that you're 
configuring with --enable-mpi-threads, have you tried configuring 
without that option?

I tested the following commands:
1) mpirun -v -machinefile machinefile.txt MyMPIProgram
-> crash on the client side (64-bit Red Hat Enterprise) when the sent buffer 
size is > 65536.
2) mpirun --mca btl_tcp_eager_limit 3000 -v -machinefile 
machinefile.txt MyMPIProgram
-> works but has the effect of generating gigantic memory consumption 
on the 32-bit machine side after MPI_Recv. Memory consumption goes from 
800 MB to 2.1 GB after receiving about 20 KB from each of the 70 clients (a 
total of about 1.4 MB).  This makes my program crash later because I 
have no more memory to allocate new structures. I read in an Open MPI 
forum thread that setting btl_tcp_eager_limit to a huge value explains 
this huge memory consumption when a sent message does not have a 
preposted ready recv. Also, after all messages have been received and 
there is no more traffic activity, the memory consumed remains at 
2.1 GB... and I do not understand why.
Are the 70 clients all on different nodes?  I am curious if the 2.1GB is 
due to the SM BTL or possibly a leak in the TCP BTL.


What is the best way to get a working program which 
also has small memory consumption (the speed performance can be lower)?
I tried to play with the MCA parameters btl_tcp_sndbuf and 
btl_tcp_rcvbuf, but without success.


Thanks in advance for your help.

Best regards,

Olivier


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.650.633.7054
Oracle - Performance Technologies
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com



[OMPI users] General question on the implementation of a "scheduler" on client side...

2010-05-20 Thread Olivier Riff
Hello,

I have a general question about the best way to implement an Open MPI
application, i.e. the design of the application.

A machine (I call it the "server") should regularly send tasks to do (byte
buffers of widely varying size) to a cluster containing a lot of processors
(the "clients").
The server should send each client a different buffer, then wait for each
client's answer (a buffer sent back by each client after some processing), and
retrieve the result data.

First I made something looking like this.
On the server side: send buffers sequentially to each client using MPI_Send.
On each client side: a loop which waits for a buffer using MPI_Recv, then
processes the buffer and sends the result using MPI_Send.
This is really not efficient because a lot of time is lost due to the fact
that the server sends and receives the buffers sequentially.
It only has the advantage of a pretty easy scheduler on the client
side:
Wait for buffer (MPI_Recv) -> Analyse it -> Send result (MPI_Send)
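
A sketch of that first, sequential server version (names and buffer sizes
are illustrative placeholders, not my real code):

#include <mpi.h>

#define NCLIENTS 70
#define BUFSZ    4096

static void server_round(char task[][BUFSZ], char result[][BUFSZ],
                         MPI_Comm comm)
{
    int c;
    for (c = 1; c <= NCLIENTS; ++c)    /* one task per client rank */
        MPI_Send(task[c - 1], BUFSZ, MPI_CHAR, c, 0, comm);
    for (c = 1; c <= NCLIENTS; ++c)    /* blocks until rank c replies */
        MPI_Recv(result[c - 1], BUFSZ, MPI_CHAR, c, 0, comm,
                 MPI_STATUS_IGNORE);
}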

My wish is to mix MPI_Send/MPI_Recv and other MPI functions like
MPI_BCast/MPI_Scatter/MPI_Gather... (like I imagine every MPI application
does).
The problem is that I cannot find an easy solution so that each client
knows which kind of MPI function is currently called by the server. If the
server calls MPI_BCast the client should do the same. Sending each time a
first message to indicate the function the server will call next does not
look very nice, so I do not see an easy/best way to implement an
"adaptive" scheduler on the client side.

Any tip, advice, or help would be appreciated.


Thanks,

Olivier


[OMPI users] Buffer size limit and memory consumption problem on heterogeneous (32 bit / 64 bit) machines

2010-05-20 Thread Olivier Riff
Hello,

I assume this question has already been discussed many times, but I can not
find a solution to my problem on the Internet.
It is about the buffer size limit of MPI_Send and MPI_Recv with a heterogeneous
system (32-bit laptop / 64-bit cluster).
My configuration is:
open mpi 1.4, configured with: --without-openib --enable-heterogeneous
--enable-mpi-threads
The program is launched from a laptop (32-bit Mandriva 2008) which distributes tasks
to a cluster of 70 processors (64-bit Red Hat Enterprise
distribution).
I have to send various buffer sizes from a few bytes up to 30 MB.

I tested the following commands:
1) mpirun -v -machinefile machinefile.txt MyMPIProgram
-> crash on the client side (64-bit Red Hat Enterprise) when the sent buffer size >
65536.
2) mpirun --mca btl_tcp_eager_limit 3000 -v -machinefile machinefile.txt
MyMPIProgram
-> works but has the effect of generating gigantic memory consumption on the
32-bit machine side after MPI_Recv. Memory consumption goes from 800 MB to 2.1 GB
after receiving about 20 KB from each of the 70 clients (a total of about 1.4
MB).  This makes my program crash later because I have no more memory to
allocate new structures. I read in an Open MPI forum thread that setting
btl_tcp_eager_limit to a huge value explains this huge memory consumption
when a sent message does not have a preposted ready recv. Also, after all
messages have been received and there is no more traffic activity, the
memory consumed remains at 2.1 GB... and I do not understand why.

What is the best way to get a working program which also has
small memory consumption (the speed performance can be lower)?
I tried to play with the MCA parameters btl_tcp_sndbuf and btl_tcp_rcvbuf,
but without success.

Thanks in advance for your help.

Best regards,

Olivier


Re: [OMPI users] Allgather in inter-communicator bug,

2010-05-20 Thread jody
Hi
I am really no python expert, but it looks to me as if you were
gathering arrays filled with zeroes:
  a = array('i', [0]) * n

Shouldn't this line be
  a = array('i', [r])*n
where r is the rank of the process?

Jody


On Thu, May 20, 2010 at 12:00 AM, Battalgazi YILDIRIM
 wrote:
> Hi,
>
>
> I am trying to use intercommunicator ::Allgather between two child processes.
> I have Fortran and Python code;
> I am using mpi4py for Python. It seems that ::Allgather is not working
> properly on my desktop.
>
> I first contacted the mpi4py developer (Lisandro Dalcin), who simplified
> my problem and provided two example files (python.py and fortran.f90;
> please see below).
>
> We tried different MPI vendors; the following example worked correctly
> (meaning the final printout should be array('i', [1, 2, 3, 4, 5, 6, 7, 8])).
>
> However, it does not give the correct answer on my two desktops (Red Hat and
> Ubuntu), both
> using Open MPI.
>
> Could you look at this problem, please?
>
> If you want to follow our earlier discussion, you can go to the following
> link:
> http://groups.google.com/group/mpi4py/browse_thread/thread/c17c660ae56ff97e
>
> yildirim@memosa:~/python_intercomm$ more python.py
> from mpi4py import MPI
> from array import array
> import os
>
> progr = os.path.abspath('a.out')
> child = MPI.COMM_WORLD.Spawn(progr,[], 8)
> n = child.remote_size
> a = array('i', [0]) * n
> child.Allgather([None,MPI.INT],[a,MPI.INT])
> child.Disconnect()
> print a
>
> yildirim@memosa:~/python_intercomm$ more fortran.f90
> program main
>  use mpi
>  implicit none
>  integer :: parent, rank, val, dummy, ierr
>  call MPI_Init(ierr)
>  call MPI_Comm_get_parent(parent, ierr)
>  call MPI_Comm_rank(parent, rank, ierr)
>  val = rank + 1
>  call MPI_Allgather(val,   1, MPI_INTEGER, &
>                     dummy, 0, MPI_INTEGER, &
>                     parent, ierr)
>  call MPI_Comm_disconnect(parent, ierr)
>  call MPI_Finalize(ierr)
> end program main
>
> yildirim@memosa:~/python_intercomm$ mpif90 fortran.f90
>
> yildirim@memosa:~/python_intercomm$ python python.py
> array('i', [0, 0, 0, 0, 0, 0, 0, 0])
>
>
> --
> B. Gazi YILDIRIM
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] How to show outputs from MPI program that runs on a cluster?

2010-05-20 Thread jody
Hi
mpirun has an option for this (check the mpirun man page):

   -tag-output, --tag-output
  Tag  each  line  of  output to stdout, stderr, and
stddiag with [jobid, rank] indicating the process jobid and
rank that generated the
  output, and the channel which generated it.

Using this you can filter the entire output by grepping for the required rank.
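
For example, to keep only the output of rank 2 of job 1 (the exact tag 
format may differ slightly between Open MPI versions):

   mpirun -np 4 --tag-output ./a.out | grep '^\[1,2\]'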

Another possibility is to use the option
   -xterm, --xterm 
  Display the specified ranks in separate xterm windows.
The ranks are specified as a comma-separated list of ranges, with a -1
indicating  all.
  A separate window will be created for each specified
rank.  Note: In some environments, xterm may require that the
executable be in the user’s
  path, or be specified in absolute or relative terms.
Thus, it may be necessary to specify a local executable as "./foo"
instead of just "foo".
  If xterm fails to find the executable, mpirun will hang,
but still respond correctly to a ctrl-c.  If this happens, please
check that the exe-
  cutable is being specified correctly and try again.

That way you can open a single terminal window for the process you are
interested in.


Jody


On Thu, May 20, 2010 at 1:28 AM, Sang Chul Choi  wrote:
> Hi,
>
> I am wondering if there is a way to run a particular process among multiple 
> processes on the console of a linux cluster.
>
> I want to see the screen output (standard output) of a particular process 
> (using a particular ID of a process) on the console screen while the MPI 
> program is running.  I think that if I run a MPI program on a linux cluster 
> using Sun Grid Engine, the particular process that prints out to standard 
> output could run on the console or computing node.   And, it would be hard to 
> see screen output of the particular process.  Is there a way to to set one 
> process aside and to run it on the console in Sun Grid Engine?
>
> When I run the MPI program on my desktop with quad cores, I can set aside one 
> process using an ID to print information that I need.  I do not know how I 
> could do that at a much larger scale, like using Sun Grid Engine.  I could let 
> one process print to a file and then look at it.  I do not know how I 
> could let one process print to the console screen by setting it to run 
> on the console using Sun Grid Engine or any other similar thing such as PBS.  
> I doubt that a cluster would allow jobs to run on the console, because then 
> other users would have trouble submitting jobs.  If this is the 
> case, there seems to be no way to print on the console.   Then, do I have to 
> have a separate (non-MPI) program that communicates with the MPI program using 
> TCP/IP by running the separate program on the master node of the cluster?  This 
> separate non-MPI program could then communicate sporadically with the MPI 
> program.  I do not know if this is a general approach or a peculiar way.
>
> I will appreciate any of input.
>
> Thank you,
>
> Sang Chul
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>