Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-06 Thread Ralph Castain
Yeah, that's a bit of a problem. The issue is that we tie a specific build to its prefix, which means we expect to find the OMPI binaries in that location. So you are supposed to set your PATH, etc., to point to the actual installation, not an abstraction. There are some ways to work around it ...
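
A sketch of the usual workarounds, assuming the install path that appears later in this thread: mpirun's --prefix option tells the remote orted where the Open MPI installation lives, and OPAL_PREFIX or configuring with --enable-mpirun-prefix-by-default accomplish the same thing.

    mpirun --prefix /opt/ompi-release-cmr-singlespawn \
        -np 8 --host localhost,pachy1 hostname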

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-06 Thread Evan Samanas
Hi Ralph, Thanks for addressing this issue. I tried downloading your fork from that pull request and the seg fault appears to be gone. However, I didn't install it on my remote machine before testing, and I got this error: bash: /opt/ompi-release-cmr-singlespawn/bin/orted: No such file or ...

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-05 Thread Ralph Castain
Okay, I tracked this down - thanks for your patience! I have a fix pending review. You can track it here: https://github.com/open-mpi/ompi-release/pull/179

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-04 Thread Evan Samanas
Indeed, I simply commented out all the MPI_Info stuff, which you essentially did by passing a dummy argument. I'm still not able to get it to succeed. So here we go, my results defy logic. I'm sure this could be my fault... I've only been an occasional user of Open MPI and MPI in general over the ...

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-03 Thread Ralph Castain
I confess I am sorely puzzled. I replaced the Info key with MPI_INFO_NULL, but still had to pass a bogus argument to master since you still have the Info_set code in there - otherwise, MPI_Info_set segfaults due to a NULL argv[1]. Doing that (and replacing "hostname" with an MPI example code) makes ...
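
A minimal sketch of the guard being described, assuming the Info key and argv[1] usage from the thread (the key name is illustrative):

    /* Only set the Info key when a hostfile argument was actually
       given; passing a NULL argv[1] makes MPI_Info_set segfault. */
    if (argc > 1) {
        MPI_Info_set(info, "hostfile", argv[1]);
    }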

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-03 Thread Evan Samanas
Yes, I did. I replaced the info argument of MPI_Comm_spawn with MPI_INFO_NULL.
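
For reference, a minimal sketch of that MPI_INFO_NULL variant (the child binary name ./worker is a placeholder):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Comm intercomm;

        MPI_Init(&argc, &argv);
        /* No Info key at all: host selection is left to the runtime. */
        MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                       0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
        MPI_Finalize();
        return 0;
    }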

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-03 Thread Ralph Castain
When running your comm_spawn code, did you remove the Info key code? You wouldn't need to provide a hostfile or hosts any more, which is why it should resolve that problem. I agree that providing either hostfile or host as an Info key will cause the program to segfault - I'm working on that issue.
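
A sketch of the failing pattern under discussion (the child binary and hostfile path are placeholders; per this thread, the "host" key behaves the same way):

    MPI_Info info;
    MPI_Comm intercomm;

    MPI_Info_create(&info);
    /* Supplying a hostfile (or host) Info key is what triggers the
       orted segfault on 1.8.4: */
    MPI_Info_set(info, "hostfile", "/path/to/hostfile");
    MPI_Comm_spawn("./worker", MPI_ARGV_NULL, 4, info,
                   0, MPI_COMM_SELF, &intercomm, MPI_ERRCODES_IGNORE);
    MPI_Info_free(&info);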

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-03 Thread Evan Samanas
Setting these environment variables did indeed change the way mpirun maps things, and I didn't have to specify a hostfile. However, setting these for my MPI_Comm_spawn code still resulted in the same segmentation fault. Evan

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-03 Thread Ralph Castain
If you add the following to your environment, you should run on multiple nodes: OMPI_MCA_rmaps_base_mapping_policy=node and OMPI_MCA_orte_default_hostfile=. The first tells OMPI to map-by node; the second passes in your default hostfile so you don't need to specify it as an Info key. HTH, Ralph ...
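
Concretely, that would look something like the following (the hostfile path is illustrative, and ./spawner stands in for the program that calls MPI_Comm_spawn):

    export OMPI_MCA_rmaps_base_mapping_policy=node
    export OMPI_MCA_orte_default_hostfile=/path/to/hostfile
    mpirun -np 1 ./spawner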

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-03 Thread Evan Samanas
Hi Ralph, Good to know you've reproduced it. I was experiencing this using both the hostfile and host key. A simple comm_spawn was working for me as well, but it was only launching locally, and I'm pretty sure each node only has 4 slots given past behavior (the mpirun -np 8 example I gave in my ...

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-03 Thread Ralph Castain
BTW: I've confirmed this only happens if you provide the hostfile info key. A simple comm_spawn without the hostfile key works just fine.

Re: [OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-02-01 Thread Ralph Castain
Well, I can reproduce it - but I won’t have time to address it until I return later this week. Whether or not procs get spawned onto a remote host depends on the number of local slots. You asked for 8 processes, so if there are more than 8 slots on the node, then it will launch them all on the ...
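
For context, slot counts normally come from the hostfile. A sketch using the hostnames from this thread (the slots values are illustrative):

    localhost slots=4
    pachy1 slots=4

With 8 processes requested against 4 local slots, the mapper has to spill onto the remote host; with 8 or more local slots, everything lands on the local node.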

[OMPI users] orted seg fault when using MPI_Comm_spawn on more than one host

2015-01-26 Thread Evan
Hi, I am using Open MPI 1.8.4 on an Ubuntu 14.04 machine and 5 Ubuntu 12.04 machines. I am using ssh to launch MPI jobs and I'm able to run simple programs like 'mpirun -np 8 --host localhost,pachy1 hostname' and get the expected output (pachy1 being an entry in my /etc/hosts file). I ...