Re: [OMPI users] default num_procs of round_robin_mapper with cpus-per-proc option

2014-01-23 Thread Ralph Castain
On Jan 22, 2014, at 8:08 PM, tmish...@jcity.maeda.co.jp wrote:
> Thanks, Ralph.
>
> I have one more question. I'm sorry to ask you many things ...
Not a problem
> Could you tell me the difference between "map-by slot" and "map-by core".
> From my understanding, slot is the synonym of

Re: [OMPI users] default num_procs of round_robin_mapper with cpus-per-proc option

2014-01-23 Thread tmishima
Thanks for your explanation, Ralph. But it's really subtle to understand for me ...
Anyway, I'd like to report what I found through verbose output. "-map-by core" calls "bind in place" as below:
[mishima@manage work]$ mpirun -np 4 -hostfile pbs_hosts -report-bindings -cpus-per-proc 4 -map-by co

Re: [OMPI users] problem with rankfile in openmpi-1.7.4rc2r30323

2014-01-23 Thread Siegmar Gross
Dear Ralph,
the same problems occur without rankfiles.
tyr fd1026 102 which mpicc
/usr/local/openmpi-1.7.4_64_cc/bin/mpicc
tyr fd1026 103 mpiexec --report-bindings -np 2 \
  -host tyr,sunpc1 hostname
tyr fd1026 104 /opt/solstudio12.3/bin/sparcv9/dbx \
  /usr/local/openmpi-1.7.4_64_cc/bin/mpiexe

Re: [OMPI users] problem with rankfile in openmpi-1.7.4rc2r30323

2014-01-23 Thread Ralph Castain
Okay, so this is a Sparc issue, not a rankfile one. I'm afraid my lack of time and access to that platform will mean this won't get fixed for 1.7.4, but I'll try to take a look at it when time permits.
On Jan 22, 2014, at 10:52 PM, Siegmar Gross wrote:
> Dear Ralph,
>
> the same problems oc

[OMPI users] Getting past firewall & something else? in Mac OS X

2014-01-23 Thread Dan Hsu
Hi All
Am trying to run a parallel molecular simulation from the directory containing the executable (using only available cores on the local cpus) on Mac Lion and keep getting an apparent firewall error that cannot be resolved. I am entering:
mpirun -np 2 -e ./mpierr1 dock6.mpi -otherinpu

Re: [OMPI users] Connection timed out with multiple nodes

2014-01-23 Thread Doug Roberts
Date: Fri, 17 Jan 2014 19:24:50 -0800
From: Ralph Castain
The most common cause of this problem is a firewall between the nodes - you can ssh across, but not communicate. Have you checked to see that the firewall is turned off?
Turns out some iptables rules (typical on our clusters) were act

Re: [OMPI users] Connection timed out with multiple nodes

2014-01-23 Thread Ralph Castain
It's the failure on readv that's the source of the trouble. What happens if you only if_include eth2? Does it work then?
On Jan 23, 2014, at 5:38 PM, Doug Roberts wrote:
>
>> Date: Fri, 17 Jan 2014 19:24:50 -0800
>> From: Ralph Castain
>>
>> The most common cause of this problem is a firewa