Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Ralph Castain
On Tue, Nov 16, 2010 at 12:23 PM, Terry Dontje wrote: > On 11/16/2010 01:31 PM, Reuti wrote: > > Hi Ralph, > > On 16.11.2010 at 15:40, Ralph Castain wrote: > > > 2. have SGE bind procs it launches to -all- of those cores. I believe SGE > does this automatically to

Re: [OMPI users] mpi-io, fortran, going crazy... (ADENDA)

2010-11-16 Thread Gus Correa
Ricardo Reis wrote: and sorry to be such a nuisance... but is there any reason for an MPI-IO "wall" between 2.0 and 2.1 GB? Salve Ricardo Reis! Is this "wall" perhaps the 2 GB Linux file size limit on 32-bit systems? Gus (1 mpi process) best, Ricardo Reis 'Non Serviam' PhD candidate

Re: [OMPI users] mpi-io, fortran, going crazy... (ADENDA)

2010-11-16 Thread Ricardo Reis
and sorry to be such a nuisance... but is there any reason for an MPI-IO "wall" between 2.0 and 2.1 GB? (1 mpi process) best, Ricardo Reis 'Non Serviam' PhD candidate @ Lasef Computational Fluid Dynamics, High Performance Computing, Turbulence http://www.lasef.ist.utl.pt Cultural

Re: [OMPI users] mpi-io, fortran, going crazy...

2010-11-16 Thread Ricardo Reis
Further to my last email, I forgot to add: it's a 12 GB machine and the file should be around 2.5 GB. I'm using mpirun -np 1. It writes without problem if I try a file of 250 MB, for instance, so it also seems to be a size-related problem. I'm using the 'native' type for writing... ideas?

[OMPI users] mpi-io, fortran, going crazy...

2010-11-16 Thread Ricardo Reis
Hi all, I have been banging my head on a wall for three days trying to make a simple Fortran MPI-IO program work. I'm using gfortran 4.4, Open MPI 1.4.1 (it's a Debian box). I have other codes that work with MPI-IO without any problem, but for some reason I can't grasp this one... doesn't

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Terry Dontje
On 11/16/2010 01:31 PM, Reuti wrote: Hi Ralph, On 16.11.2010 at 15:40, Ralph Castain wrote: 2. have SGE bind procs it launches to -all- of those cores. I believe SGE does this automatically to constrain the procs to running on only those cores. This is another "bug/feature" in SGE: it's a

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Reuti
Hi Ralph, On 16.11.2010 at 15:40, Ralph Castain wrote: > > 2. have SGE bind procs it launches to -all- of those cores. I believe SGE > > does this automatically to constrain the procs to running on only those > > cores. > > This is another "bug/feature" in SGE: it's a matter of discussion,

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Chris Jewell
On 16 Nov 2010, at 17:25, Terry Dontje wrote: >>> >> Sure. Here's the stderr of a job submitted to my cluster with 'qsub -pe >> mpi 8 -binding linear:2 myScript.com' where myScript.com runs 'mpirun -mca >> ras_gridengine_verbose 100 --report-bindings ./unterm': >> >> [exec4:17384] System

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Terry Dontje
On 11/16/2010 12:13 PM, Chris Jewell wrote: On 16 Nov 2010, at 14:26, Terry Dontje wrote: In the original case of 7 nodes and processes if we do -binding pe linear:2, and add the -bind-to-core to mpirun I'd actually expect 6 of the nodes processes bind to one core and the 7th node with 2

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Chris Jewell
On 16 Nov 2010, at 14:26, Terry Dontje wrote: > > In the original case of 7 nodes and processes if we do -binding pe linear:2, > and add the -bind-to-core to mpirun I'd actually expect 6 of the nodes > processes bind to one core and the 7th node with 2 processes to have each of > those

[OMPI users] architecture questions

2010-11-16 Thread Hicham Mouline
Hello, I currently have a serial application with a GUI that runs some calculations. My next step is to use OpenMPI with the help of the Boost.MPI wrapper library in C++ to parallelize those calculations. There is a set of static data objects created once at startup or loaded from files. 1. In

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Terry Dontje
On 11/16/2010 10:59 AM, Reuti wrote: On 16.11.2010 at 15:26, Terry Dontje wrote: 1. allocate a specified number of cores on each node to your job this is currently the bug in the "slot <=> core" relation in SGE, which has to be removed, updated or clarified. For now slot and core count

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Reuti
On 16.11.2010 at 15:26, Terry Dontje wrote: >>> >>> 1. allocate a specified number of cores on each node to your job >>> >> this is currently the bug in the "slot <=> core" relation in SGE, which has >> to be removed, updated or clarified. For now slot and core count are out of >> sync

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Ralph Castain
Hi Reuti > > 2. have SGE bind procs it launches to -all- of those cores. I believe SGE > does this automatically to constrain the procs to running on only those > cores. > > This is another "bug/feature" in SGE: it's a matter of discussion, whether > the shepherd should get exactly one core (in

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Terry Dontje
On 11/16/2010 09:08 AM, Reuti wrote: Hi, On 16.11.2010 at 14:07, Ralph Castain wrote: Perhaps I'm missing it, but it seems to me that the real problem lies in the interaction between SGE and OMPI during OMPI's two-phase launch. The verbose output shows that SGE dutifully allocated the

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Reuti
Hi, On 16.11.2010 at 14:07, Ralph Castain wrote: > Perhaps I'm missing it, but it seems to me that the real problem lies in the > interaction between SGE and OMPI during OMPI's two-phase launch. The verbose > output shows that SGE dutifully allocated the requested number of cores on > each

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Ralph Castain
Perhaps I'm missing it, but it seems to me that the real problem lies in the interaction between SGE and OMPI during OMPI's two-phase launch. The verbose output shows that SGE dutifully allocated the requested number of cores on each node. However, OMPI launches only one process on each node (the

Re: [OMPI users] source code for presentation/papers

2010-11-16 Thread Jeff Squyres
We hosted the paper as a courtesy to the author; they aren't part of the Open MPI core community. You should probably contact the author directly to obtain the work; it was not submitted upstream to us. On Nov 13, 2010, at 4:39 AM, Vasiliy G Tolstov wrote: > Hello. I read very good paper

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Reuti
On 16.11.2010 at 10:26, Chris Jewell wrote: > Hi all, > >> On 11/15/2010 02:11 PM, Reuti wrote: >>> Just to give my understanding of the problem: >> Sorry, I am still trying to grok all your email as what the problem you >> are trying to solve. So is the issue is trying to have

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Terry Dontje
On 11/16/2010 04:26 AM, Chris Jewell wrote: Hi all, On 11/15/2010 02:11 PM, Reuti wrote: Just to give my understanding of the problem: Sorry, I am still trying to grok all your email as what the problem you are trying to solve. So is the issue is trying to have two jobs having processes on

Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-16 Thread Chris Jewell
Hi all, > On 11/15/2010 02:11 PM, Reuti wrote: >> Just to give my understanding of the problem: >>> > Sorry, I am still trying to grok all your email as what the problem you > are trying to solve. So is the issue is trying to have two jobs having > processes on the same node be