Re: [OMPI users] benchmark - mpi_reduce() called only once but takes long time - proportional to calculation time

2009-12-04 Thread Qing Pang
Thank you so much! It is a synchronization issue. In my case, one node 
actually runs slower than the other node. Adding MPI_Barrier() helps to 
straighten things out.

Thank you for your help!

Eugene Loh wrote:
Your processes are probably running asynchronously.  You could perhaps 
try tracing program execution and look at the timeline.  E.g., .  Or, 
where you have MPI_Wtime calls, just capture those timestamps on each 
process and dump the results at the end of your run.  Or, report 
timings for all ranks instead of just for rank 0.

Put another way, rank 0 must broadcast n.  So, no one starts 
computation until they get the Bcast result.  Rank 0 probably starts 
its computations before anyone else does.  So, it gets to the Reduce 
before anyone else does, but it can't exit until other ranks have 
finished their computations.  So, the Reduce time on rank 0 includes 
some amount of other ranks' compute times.

Yet another approach is to insert MPI_Barrier calls at each phase of 
the program so that the various phases are synchronized.  This adds 
some overhead to the program, but helps simplify interpretation of the 
timing results.
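
The effect described above can be pictured with a toy model, sketched here in plain C without MPI. It assumes the time at which each rank reaches the reduce call is known. The names observed_reduce_ms, finish_ms and transfer_ms are hypothetical, chosen for illustration only; they are not part of Open MPI, and the real cost of MPI_Reduce depends on the implementation and the network.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model of a blocking reduction with no barrier in front of it.
 * finish_ms[r] is the wall-clock time (in ms) at which rank r reaches the
 * reduce call. The root cannot leave the call before the slowest rank has
 * contributed, so the time the root appears to spend "in the reduce" is
 * the gap between its own arrival and the last arrival, plus the cost of
 * moving the data (transfer_ms, an assumed constant).
 */
double observed_reduce_ms(const double *finish_ms, size_t nranks,
                          size_t root, double transfer_ms)
{
    assert(finish_ms != NULL && nranks > 0 && root < nranks);

    double last = finish_ms[0];
    for (size_t r = 1; r < nranks; ++r) {
        if (finish_ms[r] > last)
            last = finish_ms[r];
    }
    /* last >= finish_ms[root], so the waiting term is never negative. */
    return (last - finish_ms[root]) + transfer_ms;
}
```

With made-up arrival times of 1000 ms for rank 0 and 1500 ms for rank 1 and a 1 ms transfer, the model charges 501 ms to the reduce on rank 0. If a barrier makes both ranks arrive together, only the 1 ms transfer remains, which is why the barrier makes the per-phase timings easier to read.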

Qing Pang wrote:

I'm running the popular Calculate PI program on a 2-node setup 
running Ubuntu 8.10 and openmpi 1.3.3 (with default settings). 
Password-less ssh is set up, but there is no cluster management 
software such as a network file system, network time protocol, 
resource management, scheduler, etc. The two nodes are connected 
through TCP/IP only.

When I tried to benchmark the program, it showed that the time spent 
in MPI_Reduce() is proportional to the number of intervals (n) used 
in the calculation. For example, when n = 1,000,000, MPI_Reduce costs 
15.65 milliseconds, while for n = 1,000,000,000, MPI_Reduce costs 
15526 milliseconds.

This confused me - in this Calc-PI program, MPI_Reduce is used only 
once - no matter what number of intervals is used, MPI_Reduce is 
called after both nodes have their results, to merge them - just 
once.  So the time taken by MPI_Reduce (although it might be slow 
over a TCP/IP connection) should be roughly constant. But 
obviously that's not what I saw.

Has anyone had a similar problem before? I'm not sure how 
MPI_Reduce() works internally. Does the fact that I don't have a 
network file system, network time protocol, resource management, 
scheduler, etc. installed matter?

Below is the program - I did feed "n" to it more than once to warm it up.

#include "mpi.h"
#include <stdio.h>
#include <math.h>

int main(int argc, char *argv[])
{
   int numprocs, myid, rc;
   double ACCUPI = 3.1415926535897932384626433832795;
   double mypi, pi, h, sum, x;
   int n, i;
   double starttime, endtime;
   double time, told, bcasttime, reducetime, comptime, totaltime;

   rc = MPI_Init(&argc, &argv);
   if (rc != MPI_SUCCESS) {
      printf("Error starting MPI program. Terminating.\n");
      MPI_Abort(MPI_COMM_WORLD, rc);
   }
   MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
   MPI_Comm_rank(MPI_COMM_WORLD, &myid);

   while (1) {
      /* Rank 0 reads n; the total timer starts once n is known. */
      if (myid == 0) {
         printf("Enter the number of intervals: (0 quits) \n");
         fflush(stdout);
         if (scanf("%d", &n) != 1)
            n = 0;
         starttime = MPI_Wtime();
      }

      time = MPI_Wtime();
      MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

      told = time;
      time = MPI_Wtime();
      bcasttime = time - told;

      if (n == 0)
         break;
      else {
         /* Midpoint rule for the integral of 4/(1+x^2) on [0,1];
            each rank takes every numprocs-th interval. */
         h = 1.0/(double)n;
         sum = 0.0;
         for (i = myid + 1; i <= n; i += numprocs) {
            x = h*((double)i - 0.5);
            sum += (4.0/(1.0 + x*x));
         }
         mypi = sum*h;

         told = time;
         time = MPI_Wtime();
         comptime = time - told;

         MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0,
                    MPI_COMM_WORLD);

         told = time;
         time = MPI_Wtime();
         reducetime = time - told;

         if (myid == 0) {
            totaltime = MPI_Wtime() - starttime;
            /* MPI_Wtime() returns seconds; report milliseconds. */
            printf("\nElapsed time (total): %f milliseconds\n", totaltime*1000);
            printf("Elapsed time (Bcast):  %f milliseconds\n", bcasttime*1000);
            printf("Elapsed time (Reduce): %f milliseconds\n", reducetime*1000);
            printf("Elapsed time (Comput): %f milliseconds\n", comptime*1000);
            printf("\nApproximated pi is %.16f, Error is %.4e\n", pi,
                   fabs(pi - ACCUPI));
         }
      }
   }

   MPI_Finalize();
   return 0;
}


[OMPI users] benchmark - mpi_reduce() called only once but takes long time - proportional to calculation time

2009-11-25 Thread Qing Pang

Dear users,

I'm running the popular Calculate PI program on a 2-node setup running 
Ubuntu 8.10 and openmpi 1.3.3 (with default settings). Password-less ssh 
is set up, but there is no cluster management software such as a network 
file system, network time protocol, resource management, scheduler, etc. 
The two nodes are connected through TCP/IP only.

When I tried to benchmark the program, it showed that the time spent in 
MPI_Reduce() is proportional to the number of intervals (n) used in the 
calculation. For example, when n = 1,000,000, MPI_Reduce costs 15.65 
milliseconds, while for n = 1,000,000,000, MPI_Reduce costs 15526 
milliseconds.

This confused me - in this Calc-PI program, MPI_Reduce is used only once 
- no matter what number of intervals is used, MPI_Reduce is called after 
both nodes have their results, to merge them - just once.  So the time 
taken by MPI_Reduce (although it might be slow over a TCP/IP 
connection) should be roughly constant. But obviously that's not what I 
saw.

Has anyone had a similar problem before? I'm not sure how 
MPI_Reduce() works internally. Does the fact that I don't have a network 
file system, network time protocol, resource management, scheduler, etc. 
installed matter?

Below is the program - I did feed "n" to it more than once to warm it up.

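#include "mpi.h"
#include <stdio.h>
#include <math.h>

int main(int argc, char *argv[])
{
  int numprocs, myid, rc;

  double ACCUPI = 3.1415926535897932384626433832795;
  double mypi, pi, h, sum, x;
  int n, i;
  double starttime, endtime;
  double time, told, bcasttime, reducetime, comptime, totaltime;

  rc = MPI_Init(&argc, &argv);
  if (rc != MPI_SUCCESS) {
     printf("Error starting MPI program. Terminating.\n");
     MPI_Abort(MPI_COMM_WORLD, rc);
  }
  MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
  MPI_Comm_rank(MPI_COMM_WORLD, &myid);

  while (1) {
     /* Rank 0 reads n; the total timer starts once n is known. */
     if (myid == 0) {
        printf("Enter the number of intervals: (0 quits) \n");
        fflush(stdout);
        if (scanf("%d", &n) != 1)
           n = 0;
        starttime = MPI_Wtime();
     }

     time = MPI_Wtime();
     MPI_Bcast(&n, 1, MPI_INT, 0, MPI_COMM_WORLD);

     told = time;
     time = MPI_Wtime();
     bcasttime = time - told;

     if (n == 0)
        break;
     else {
        /* Midpoint rule for the integral of 4/(1+x^2) on [0,1];
           each rank takes every numprocs-th interval. */
        h = 1.0/(double)n;
        sum = 0.0;
        for (i = myid + 1; i <= n; i += numprocs) {
           x = h*((double)i - 0.5);
           sum += (4.0/(1.0 + x*x));
        }
        mypi = sum*h;

        told = time;
        time = MPI_Wtime();
        comptime = time - told;

        MPI_Reduce(&mypi, &pi, 1, MPI_DOUBLE, MPI_SUM, 0,
                   MPI_COMM_WORLD);

        told = time;
        time = MPI_Wtime();
        reducetime = time - told;

        if (myid == 0) {
           totaltime = MPI_Wtime() - starttime;
           /* MPI_Wtime() returns seconds; report milliseconds. */
           printf("\nElapsed time (total): %f milliseconds\n", totaltime*1000);
           printf("Elapsed time (Bcast):  %f milliseconds\n", bcasttime*1000);
           printf("Elapsed time (Reduce): %f milliseconds\n", reducetime*1000);
           printf("Elapsed time (Comput): %f milliseconds\n", comptime*1000);
           printf("\nApproximated pi is %.16f, Error is %.4e\n", pi,
                  fabs(pi - ACCUPI));
        }
     }
  }

  MPI_Finalize();
  return 0;
}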
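
The way the loop shares the intervals between ranks can be checked without MPI. The following is only a serial sketch: partial_pi and combined_pi are hypothetical helper names, and combined_pi simply adds the partial results in a loop, standing in for what the MPI_SUM reduction delivers to rank 0.

```c
#include <assert.h>

/*
 * Partial midpoint-rule sum for one rank, using the same striding as the
 * program: rank r handles intervals r+1, r+1+numprocs, r+1+2*numprocs, ...
 * so each rank performs roughly n/numprocs iterations.
 */
double partial_pi(int n, int rank, int numprocs)
{
    assert(n > 0 && numprocs > 0 && rank >= 0 && rank < numprocs);

    double h = 1.0 / (double)n;
    double sum = 0.0;
    for (int i = rank + 1; i <= n; i += numprocs) {
        double x = h * ((double)i - 0.5);
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * h;
}

/* Serial stand-in for the reduction: add up every rank's partial result. */
double combined_pi(int n, int numprocs)
{
    double pi = 0.0;
    for (int rank = 0; rank < numprocs; ++rank)
        pi += partial_pi(n, rank, numprocs);
    return pi;
}
```

Because each rank's loop length grows linearly with n, so does its compute time, and any difference in speed between the two nodes grows with n as well.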



Re: [OMPI users] Problem with mpirun -preload-binary option

2009-11-12 Thread Qing Pang
Now that I have password-less ssh set up in both directions, and verified 
that it works - I still have the same problem.
I'm able to run ssh/scp on both master and client nodes (at this 
point, they are pretty much the same) without being asked for a password. 
And mpirun works fine if I put the executable in the same directory 
on both nodes.

But when I tried the preload-binary option, I still had the same 
problem - it asked me for the password of the node running mpirun, and 
then told me that scp failed.


Josh Wrote:

Though the --preload-binary option was created while building the 
checkpoint/restart functionality, it does not depend on the 
checkpoint/restart function in any way (just a side effect of the 
initial development).

The problem you are seeing is a result of how password-less ssh is set 
up in your computing environment. The --preload-binary option uses 'scp' 
(at the moment) to copy the files from the node running mpirun to the 
compute nodes. The compute nodes are the ones that call 'scp', so you 
will need to set up password-less ssh in both directions.

-- Josh

On Nov 11, 2009, at 8:38 AM, Ralph Castain wrote:

 I'm no expert on the preload-binary option - but I would suspect that 
is the case given your observations.

 That option was created to support checkpoint/restart, not for what 
you are attempting to do. Like I said, you -should- be able to use it 
for that purpose, but I expect you may hit a few quirks like this along 
the way.

 On Nov 11, 2009, at 9:16 AM, Qing Pang wrote:

> Thank you very much for your help! I believe I do have password-less 
> ssh set up, at least from master node to client node (desktop -> laptop 
> in my case). If I type >ssh node1 on my desktop terminal, I am able to 
> get to the laptop node without being asked for a password. And as I 
> mentioned, if I copy the example executable from desktop to the laptop 
> node using scp, then I am able to run it from desktop using both nodes.
> Back to the preload-binary problem - I am asked for the password of 
> my master node - the node I am working on - not the remote client node. 
> Do you mean that I should set up password-less ssh in both directions? 
> Does the client node need to access the master node through 
> password-less ssh to make the preload-binary option work?

> Ralph Castain wrote:
> It -should- work, but you need password-less ssh set up. See our FAQ
> for how to do that, if you are unfamiliar with it.
> On Nov 10, 2009, at 2:02 PM, Qing Pang wrote:
>> I'm having a problem getting the mpirun "preload-binary" option to work.
>> I'm using Ubuntu 8.10 with openmpi 1.3.3, nodes connected with an 
>> Ethernet cable.
>> If I copy the executable to client nodes using scp, then do mpirun, 
>> everything works.
>>
>> But I really want to avoid the copying, so I tried the 
>> -preload-binary option.
>>
>> When I typed the command on my master node as below (gordon-desktop 
>> is my master node, and gordon-laptop is the client node):
>>
>> gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun
>> -machinefile machine.linux -np 2 --preload-binary $(pwd)/hello_c.out
>>
>> I got the following:
>> gordon@gordon-desktop's password: (I entered my password here; 
>> why am I asked for the password? I am working under this account anyway)
>>
>> WARNING: Remote peer ([[18118,0],1]) failed to preload a file.
>> Exit Status: 256
>> Local File: 
>> Remote File: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
>> Command:
>> scp 
>> /tmp/openmpi-sessions-gordon@gordon-laptop_0/18118/0/hello_c.out
>> Will continue attempting to launch the process(es).
>>
>> mpirun was unable to launch the specified application as it could 
>> not access
>> or execute an executable:
>> Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
>> Node: node1
>> while attempting to start process rank 1.
>>
>> Has anyone succeeded with the 'preload-binary' option with 
>> similar settings? I assume this mpirun option should work when 
>> openmpi is compiled with default options? Is there anything I need 
>> to set?
>>
>> --qing

Re: [OMPI users] Problem with mpirun -preload-binary option

2009-11-11 Thread Qing Pang
Thank you very much for your help! I believe I do have password-less ssh 
set up, at least from master node to client node (desktop -> laptop in 
my case). If I type >ssh node1 on my desktop terminal, I am able to get 
to the laptop node without being asked for a password. And as I mentioned, 
if I copy the example executable from desktop to the laptop node using 
scp, then I am able to run it from desktop using both nodes.
Back to the preload-binary problem - I am asked for the password of my 
master node - the node I am working on - not the remote client node. Do 
you mean that I should set up password-less ssh in both directions? Does 
the client node need to access the master node through password-less ssh 
to make the preload-binary option work?

Ralph Castain wrote:

It -should- work, but you need password-less ssh set up. See our FAQ
for how to do that, if you are unfamiliar with it.

On Nov 10, 2009, at 2:02 PM, Qing Pang wrote:

I'm having a problem getting the mpirun "preload-binary" option to work.

I'm using Ubuntu 8.10 with openmpi 1.3.3, nodes connected with an Ethernet 
cable.
If I copy the executable to client nodes using scp, then do mpirun, 
everything works.

But I really want to avoid the copying, so I tried the -preload-binary 
option.

When I typed the command on my master node as below (gordon-desktop is 
my master node, and gordon-laptop is the client node):

gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$  mpirun
-machinefile machine.linux -np 2 --preload-binary $(pwd)/hello_c.out

I got the following:

gordon@gordon-desktop's password:  (I entered my password here; why 
am I asked for the password? I am working under this account anyway)

WARNING: Remote peer ([[18118,0],1]) failed to preload a file.

Exit Status: 256
Local  File: 

Remote File: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out

Will continue attempting to launch the process(es).

mpirun was unable to launch the specified application as it could not 
access or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Node: node1

while attempting to start process rank 1.

Has anyone succeeded with the 'preload-binary' option with similar 
settings? I assume this mpirun option should work when openmpi is 
compiled with default options? Is there anything I need to set?


[OMPI users] Problem with mpirun -preload-binary option

2009-11-10 Thread Qing Pang

I'm having a problem getting the mpirun "preload-binary" option to work.

I'm using Ubuntu 8.10 with openmpi 1.3.3, nodes connected with an Ethernet 
cable.
If I copy the executable to client nodes using scp, then do mpirun, 
everything works.

But I really want to avoid the copying, so I tried the -preload-binary 
option.

When I typed the command on my master node as below (gordon-desktop is 
my master node, and gordon-laptop is the client node):

gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$  mpirun
-machinefile machine.linux -np 2 --preload-binary $(pwd)/hello_c.out

I got the following:

gordon@gordon-desktop's password:  (I entered my password here; why 
am I asked for the password? I am working under this account anyway)

WARNING: Remote peer ([[18118,0],1]) failed to preload a file.

Exit Status: 256
Local  File: 

Remote File: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out

Will continue attempting to launch the process(es).

mpirun was unable to launch the specified application as it could not 
access or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out
Node: node1

while attempting to start process rank 1.

Has anyone succeeded with the 'preload-binary' option with similar 
settings? I assume this mpirun option should work when openmpi is 
compiled with default options? Is there anything I need to set?


Re: [OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread Qing Pang

Thank you Jeff! That solves the problem. :-) You are a lifesaver!
So does that mean I always need to copy my application to all the 
nodes? Or should I give the pathname of my executable in a different 
way to avoid this? Do I need a network file system for that?

Jeff Squyres wrote:
The short version of the answer is to check to see that the executable 
is in the same location on both nodes (apparently: 
/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out).  Open MPI is 
complaining that it can't find that specific executable on the .194 node.

See below for more detail.

On Nov 5, 2009, at 3:19 PM, qing pang wrote:

1) I'm trying to run OpenMPI with the following setup:

1 PC (as master node) and 1 notebook (as client node) connected to an
ethernet router through ethernet cable. Both running Ubuntu 8.10.
There are no other connections. - Is this setup OK to run OpenMPI?


2) Prerequisites

SSH has been set up so that the master node can access the client node
through passwordless ssh. I do notice that it takes 10~15 seconds
between me entering the '>ssh ' command and getting onto
the client node.
--- Could this be too slow for openmpi to run properly?

Nope -- should be ok.

I do not have programs like network file system, network time protocol,
resource management, scheduler, etc. installed.
--- Does OpenMPI need any prerequisites other than passwordless ssh?

Not in this case, no.

3) OpenMPI is installed on both nodes - downloaded from,
and built with configure/make all using Default Settings.

On both nodes,
PATH is /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games,
which is the default setting in ubuntu.
LD_LIBRARY_PATH is set in ~/.bashrc - I added one line at the end of the
file, 'export LD_LIBRARY_PATH=usr/local/lib:usr/lib'
So when I echo them on both nodes, I get:
 >echo $PATH


But, if I do
 >ssh  'echo $LD_LIBRARY_PATH'
nothing comes back.

 >ssh  'echo $PATH'
comes back with the right path.

Is that a problem?


4) Problem:
I compiled the example hello_c using
 >mpicc hello_c.c -o hello_c.out
and ran it on both nodes locally; everything works fine.

But when I tried to run it on 2 nodes (-np 2)
 >mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out
I got the following error:


gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun
--machinefile machine.linux -np 2 $(pwd)/hello_c.out

mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out

You are giving an absolute pathname in the mpirun command line:

mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out

Hence, it's looking for exactly 
/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out on both 
nodes.  If the executable is in a different directory on the other 
node, that's where you're probably running into the problem.
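
A quick way to check the point above on each node is to ask whether the exact absolute path names an executable file there. The following is a small sketch using POSIX stat() and access(); can_execute is a hypothetical helper name, and this is only a diagnostic one might run by hand on each node, not a description of what mpirun does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 1 if `path` names a regular file that the calling user is
 * allowed to execute, and 0 otherwise (missing file, directory, or no
 * execute permission). Symbolic links are followed by stat().
 */
int can_execute(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
        return 0;
    return access(path, X_OK) == 0;
}
```

Calling can_execute("/home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out") on both the desktop and the laptop should return 1 on each before mpirun is given that path.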

[OMPI users] mpirun example program fail on multiple nodes - unable to launch specified application on client node

2009-11-05 Thread qing pang

Dear Sir/Madam,

I'm having a problem running the example program. Please kindly help --- 
I've been fooling with it for days, and am kind of getting lost.

MPIRUN fails on example hello program
-unable to launch the specified application on client node

1) I'm trying to run OpenMPI with the following setup:

1 PC (as master node) and 1 notebook (as client node) connected to an 
ethernet router through ethernet cable. Both running Ubuntu 8.10. 
There are no other connections. - Is this setup OK to run OpenMPI?

2) Prerequisites

SSH has been set up so that the master node can access the client node 
through passwordless ssh. I do notice that it takes 10~15 seconds 
between me entering the '>ssh ' command and getting onto 
the client node.

--- Could this be too slow for openmpi to run properly?

I do not have programs like network file system, network time protocol, 
resource management, scheduler, etc. installed.

--- Does OpenMPI need any prerequisites other than passwordless ssh?

3) OpenMPI is installed on both nodes - downloaded from, 
and built with configure/make all using Default Settings.

On both nodes,
PATH is /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games,
which is the default setting in ubuntu.
LD_LIBRARY_PATH is set in ~/.bashrc - I added one line at the end of the 
file, 'export LD_LIBRARY_PATH=usr/local/lib:usr/lib'

So when I echo them on both nodes, I get:
>echo $PATH

But, if I do
>ssh  'echo $LD_LIBRARY_PATH'
nothing comes back.

>ssh  'echo $PATH'
comes back with the right path.

Is that a problem?

4) Problem:
I compiled the example hello_c using
>mpicc hello_c.c -o hello_c.out
and ran it on both nodes locally; everything works fine.

But when I tried to run it on 2 nodes (-np 2)
>mpirun -machinefile machine.linux -np 2 $(pwd)/hello_c.out
I got the following error:

gordon@gordon-desktop:~/Desktop/openmpi-1.3.3/examples$ mpirun 
--machinefile machine.linux -np 2 $(pwd)/hello_c.out

mpirun was unable to launch the specified application as it could not access
or execute an executable:

Executable: /home/gordon/Desktop/openmpi-1.3.3/examples/hello_c.out

while attempting to start process rank 1.

Sometimes I get one other error message after that:
[gordon-desktop:30748] [[25975,0],0]-[[25975,1],0] mca_oob_tcp_msg_recv: 
readv failed: Connection reset by peer (104)


5) Information attached:
ifconfig_masternode - output of ifconfig on masternode
ifconfig_slavenode - output of ifconfig on slavenode
ompi_info.txt - output of ompi_info -all
config.log - OpenMPI logfile
machine.linux - the machinefile used in mpirun command

Qing Pang
(601) 979 0270

Description: application/gzip
MPIRUN fails on example hello program 
-unable to launch the specified application on client node 

1) I'm trying to run OpenMPI with the following setup:

1 PC (as master node) and 1 notebook (as client node) connected to an ethernet 
router through ethernet cable. Both running Ubuntu 8.10. There are no other 
connections. - Is this setup OK to run OpenMPI?

2) Prerequisites

SSH has been set up so that the master node can access the client node through 
passwordless ssh. I do notice that it takes 10~15 seconds between me entering 
the '>ssh ' command and getting onto the client node. - Can this 
be too slow for openmpi to run properly? 

I do not have programs like network file system, network time protocol, 
resource management, scheduler, etc. installed. - Does OpenMPI have any 
prerequisites other than passwordless ssh?

3) OpenMPI is installed on both nodes - downloaded from, and built with 
configure/make all using Default Settings.

On both nodes,
PATH is 
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games, which 
is the default setting in ubuntu.
LD_LIBRARY_PATH is set in ~/.bashrc - I added one line at the end of the file, 
'export LD_LIBRARY_PATH=usr/local/lib:usr/lib'
So when I echo them on both nodes, I get:
>echo $PATH 