Jose --
This sounds like a problem that we just recently fixed in the 1.0.x
branch -- there were some situations where the "wrong" ethernet
device could have been picked by Open MPI (e.g., if you have a
cluster with all private IP addresses, and you run an MPI job that
spans the head node and the compute nodes).
Finally it was a network problem. I had to exclude one network interface on
the master node of the cluster by restricting Open MPI to eth1, setting
btl_tcp_if_include = eth1 in the file /usr/local/etc/openmpi-mca-params.conf
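For reference, the MCA parameter file is plain "key = value" text; a minimal sketch of what mine looks like (the interface name eth1 is specific to my machines, adjust to your own setup):

```
# /usr/local/etc/openmpi-mca-params.conf
# Restrict the TCP BTL to a single interface so Open MPI does not
# pick an interface on an unroutable private network.
btl_tcp_if_include = eth1
```

The same thing can be passed on one run only with mpirun's --mca option instead of editing the file.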
thank you all for your help.
Jose Pedro
On 3/1/06, Jose Pedro Garcia Mahedero wrote:
OK, it ALMOST works!!
Now I've installed MPI on a non-clustered machine and it works, but
surprisingly, it works fine from machine OUT1 as master to machine CLUSTER1
as slave, but (here was my surprise) it doesn't work in the other direction!
If I run the same program with CLUSTER1 as master it only sen
You're right, I'll try netpipes first and then the application. If
it doesn't work I'll send configs and more detailed information.
Thank you!
On 3/1/06, Brian Barrett wrote:
Jose -
I noticed that your output doesn't appear to match what the source
code is capable of generating. It's possible that you're running
into problems with the code that we can't see because you didn't send
a complete version of the source code.
You might want to start by running some
Argh, sorry for the b/w misuse. I think I got this wrong on my first
test program too.
Maybe output is stuck in the stdout buffers. I don't see that the slave
is ever going to exit (no DIETAG).
Spoke before thinking,
/jr
Jose Pedro Garcia Mahedero wrote:
Mmmh I don't understand you:
My (slave) call is:
MPI_Recv(&work, 1, MPI_INT, 0, MPI_ANY_TAG,MPI_COMM_WORLD, &status);
And MPI_Recv signature is:
int MPI_Recv( void *buf, int count, MPI_Datatype datatype, int source, int
tag, MPI_Comm comm, MPI_Status *status )
So:
void *buf -> &work
int count -> 1
Your MPI_Recv is trying to receive from the slave(1), not the master (0).
Jose Pedro Garcia Mahedero wrote:
Hello everybody.
I'm new to MPI and I'm having some problems while running a simple pingpong
program on more than one node.
1.- I followed all the instructions and installed Open MPI without problems
on a Beowulf cluster.
2.- The cluster is working OK and ssh keys are set so there is no password
prompt