Re: [OMPI devel] init_thread + spawn error

2008-04-03 Thread Ralph Castain
I believe we have stated several times that we are not thread safe at this time. You are welcome to try it, but you shouldn't be surprised when it fails. Ralph On 4/3/08 4:18 PM, "Joao Vicente Lima" wrote: > Hi, > I'm getting an error when calling init_thread and comm_spawn in this code: > > #include "mp

[OMPI devel] init_thread + spawn error

2008-04-03 Thread Joao Vicente Lima
Hi, I'm getting an error when calling init_thread and comm_spawn in this code: #include "mpi.h" #include int main (int argc, char *argv[]) { int provided; MPI_Comm parentcomm, intercomm; MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided); MPI_Comm_get_parent (&parentcomm); if (par
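
The program quoted above is cut off by the archive preview. A minimal, self-contained sketch of the pattern under discussion (MPI_Init_thread with MPI_THREAD_MULTIPLE, then the parent spawning a copy of itself) might look like the following; the second include, the spawn count, and the use of argv[0] as the spawned binary are assumptions, not the poster's exact code:

#include "mpi.h"
#include <stdio.h>

int main (int argc, char *argv[])
{
    int provided;
    MPI_Comm parentcomm, intercomm;

    MPI_Init_thread (&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Comm_get_parent (&parentcomm);

    if (parentcomm == MPI_COMM_NULL) {
        /* No parent: we are the original process, so spawn one child
           running the same binary (argv[0] is assumed to be reachable
           on the target nodes). */
        MPI_Comm_spawn (argv[0], MPI_ARGV_NULL, 1, MPI_INFO_NULL, 0,
                        MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
    }

    MPI_Finalize ();
    return 0;
}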

Re: [OMPI devel] MPI_Comm_connect/Accept

2008-04-03 Thread Ralph Castain
Take a gander at ompi/tools/ompi-server - I believe I put a man page in there. You might just try "man ompi-server" and see if it shows up. Holler if you have a question - not sure I documented it very thoroughly at the time. On 4/3/08 3:10 PM, "Aurélien Bouteiller" wrote: > Ralph, > > > I a

Re: [OMPI devel] MPI_Comm_connect/Accept

2008-04-03 Thread Aurélien Bouteiller
Ralph, I am using the trunk. Is there documentation for ompi-server? Sounds exactly like what I need to fix point 1. Aurelien On Apr 3, 08, at 17:06, Ralph Castain wrote: I guess I'll have to ask the basic question: what version are you using? If you are talking about the trunk, there n

Re: [OMPI devel] MPI_Comm_connect/Accept

2008-04-03 Thread Ralph Castain
I guess I'll have to ask the basic question: what version are you using? If you are talking about the trunk, there no longer is a "universe" concept anywhere in the code. Two mpiruns can connect/accept to each other as long as they can make contact. To facilitate that, we created an "ompi-server"
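
For reference, the rendezvous pattern Ralph describes is usually wired up roughly as below. The option spellings (--report-uri, --ompi-server file:) are from later Open MPI releases and may differ on the 2008 trunk, so treat them as assumptions rather than the exact syntax being discussed:

ompi-server --report-uri /tmp/ompi-server.uri
mpirun --ompi-server file:/tmp/ompi-server.uri -np 1 ./server_prog
mpirun --ompi-server file:/tmp/ompi-server.uri -np 1 ./client_prog

Both mpiruns point at the same ompi-server, which acts as the shared name/contact service so the two otherwise independent jobs can find each other.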

[OMPI devel] MPI_Comm_connect/Accept

2008-04-03 Thread Aurélien Bouteiller
Hi everyone, I'm trying to figure out how complete the implementation of Comm_connect/Accept is. I found two problematic cases. 1) Two different programs are started in two different mpiruns. One calls accept, the second one calls connect. I would not expect MPI_Publish_name/Lookup_name to w
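
The two-mpirun scenario in point 1 follows the standard MPI-2 dynamics pattern. A minimal sketch of the two sides (two separate executables; the service name and the omitted error handling are assumptions) might be:

/* server side: open a port, publish it, and wait for the client */
#include "mpi.h"

int main (int argc, char *argv[])
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm client;

    MPI_Init (&argc, &argv);
    MPI_Open_port (MPI_INFO_NULL, port);
    MPI_Publish_name ("my-service", MPI_INFO_NULL, port);  /* needs a shared name service */
    MPI_Comm_accept (port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &client);
    /* ... communicate over 'client' ... */
    MPI_Unpublish_name ("my-service", MPI_INFO_NULL, port);
    MPI_Close_port (port);
    MPI_Finalize ();
    return 0;
}

/* client side: look up the published port and connect to it */
#include "mpi.h"

int main (int argc, char *argv[])
{
    char port[MPI_MAX_PORT_NAME];
    MPI_Comm server;

    MPI_Init (&argc, &argv);
    MPI_Lookup_name ("my-service", MPI_INFO_NULL, port);
    MPI_Comm_connect (port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &server);
    /* ... communicate over 'server' ... */
    MPI_Finalize ();
    return 0;
}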

Re: [OMPI devel] RFC: changes to modex

2008-04-03 Thread Jeff Squyres
On Apr 3, 2008, at 11:16 AM, Jeff Squyres wrote: The size of the openib modex is explained in btl_openib_component.c in the branch. It's a packed message now; we don't just blindly copy an entire struct. Here's the comment: /* The message is packed into multiple parts: * 1. a uint8_t
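
The quoted comment is cut off, but the general idea of packing field by field (rather than blindly copying an entire struct) can be illustrated with a small standalone sketch. The field set and the helper below are purely illustrative assumptions, not the actual btl_openib layout or the OPAL pack API:

#include <stdint.h>
#include <string.h>

/* Illustrative packed modex message: a message-type byte followed by
   fixed-width per-port fields, copied one at a time so that struct
   padding and host-specific layout never reach the wire. */
static size_t pack_port (uint8_t *buf, uint8_t msg_type,
                         uint64_t subnet_id, uint16_t lid)
{
    size_t off = 0;
    memcpy (buf + off, &msg_type,  sizeof msg_type);  off += sizeof msg_type;
    memcpy (buf + off, &subnet_id, sizeof subnet_id); off += sizeof subnet_id;
    memcpy (buf + off, &lid,       sizeof lid);       off += sizeof lid;
    return off;  /* number of bytes actually packed */
}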

Re: [OMPI devel] RFC: changes to modex

2008-04-03 Thread Jeff Squyres
On Apr 3, 2008, at 8:52 AM, Gleb Natapov wrote: It'll increase it compared to the optimization that we're about to make. But it will certainly be a large decrease compared to what we're doing today. Maybe I don't understand something in what you propose, then. Currently when I run two procs

Re: [OMPI devel] Ssh tunnelling broken in trunk?

2008-04-03 Thread Jon Mason
On Wednesday 02 April 2008 08:04:10 pm Ralph Castain wrote: > Hmmm...something isn't making sense. Can I see the command line you used to > generate this? mpirun --n 2 --host vic12,vic20 -mca btl openib,self --mca btl_openib_receive_queues P,65536,256,128,128 -d xterm -e gdb /usr/mpi/gcc/openmpi

Re: [OMPI devel] RFC: changes to modex

2008-04-03 Thread Jeff Squyres
On Apr 3, 2008, at 9:18 AM, Gleb Natapov wrote: I am talking about the openib part of the modex. The "garbage" I am referring to is this: FWIW, on the openib-cpc2 branch, the base data that is sent in the modex is this: uint64_t subnet_id; /** LID of this port */ uint16_t lid; /
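
Reformatted from the flattened preview above, the per-port base data being quoted is essentially the following (the subnet_id comment is an assumption; the remaining fields are cut off by the preview):

uint64_t subnet_id;   /* subnet prefix of this port (assumed meaning) */
uint16_t lid;         /* LID of this port */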

Re: [OMPI devel] RFC: changes to modex

2008-04-03 Thread Gleb Natapov
On Thu, Apr 03, 2008 at 07:05:28AM -0600, Ralph H Castain wrote: > H...since I have no control nor involvement in what gets sent, perhaps I > can be a disinterested third party. ;-) > > Could you perhaps explain this comment: > > > BTW I looked at how we do modex now on the trunk. For OOB cas

Re: [OMPI devel] RFC: changes to modex

2008-04-03 Thread Ralph H Castain
H...since I have no control nor involvement in what gets sent, perhaps I can be a disinterested third party. ;-) Could you perhaps explain this comment: > BTW I looked at how we do modex now on the trunk. For OOB case more > than half the data we send for each proc is garbage. What "garbage

Re: [OMPI devel] RFC: changes to modex

2008-04-03 Thread Gleb Natapov
On Wed, Apr 02, 2008 at 08:41:14PM -0400, Jeff Squyres wrote: > >> that it's the same for all procs on all hosts. I guess there's a few > >> cases: > >> > >> 1. homogeneous include/exclude, no carto: send all in node info; no > >> proc info > >> 2. homogeneous include/exclude, carto is used: send