[OMPI users] IRIX: unrecognized opcode `leaf(opal_atomic_mb)'

2008-04-26 Thread Daniel
Dear Developers,

I have run into the same error message that Jonathan Day reported before,
http://www.open-mpi.org/community/lists/users/2005/09/0138.php

I am using
IRIX 6.5,
Open MPI 1.2.6,
gcc 4.3.0 (gcc, g++, gfortran),
GNU binutils 2.18,
and the answer from Brian was:
-
Gah - shame on me. I let some IRIX-specific stuff slip through. 
Lemme see if I can find an IRIX box and clean that up. The problems 
you listed below are not MIPS 32 / MIPS 64 issues, but the use of 
some nice IRIX-specific macros. By the way, to clarify, the assembly 
has been tested on a MIPS R14K in 64 bit mode (and 32 bit mode using 
SGI's n32 ABI -- it will not work with their o32 ABI). Just not on 
anything other than IRIX ;). 
-

but so far I do not understand what this answer means. Do I need to change
some of the code and add the "-n32" option where ld is used?

Please help; I would really appreciate it.

Daniel

--
Below is what I see when I run "make".
--
atomic-asm.s: Assembler messages: 
atomic-asm.s:8: Error: unrecognized opcode `leaf(opal_atomic_mb)' 
atomic-asm.s:13: Error: unrecognized opcode `end(opal_atomic_mb)' 
atomic-asm.s:17: Error: unrecognized opcode `leaf(opal_atomic_rmb)' 
atomic-asm.s:22: Error: unrecognized opcode `end(opal_atomic_rmb)' 
atomic-asm.s:25: Error: unrecognized opcode `leaf(opal_atomic_wmb)' 
atomic-asm.s:30: Error: unrecognized opcode `end(opal_atomic_wmb)' 
atomic-asm.s:33: Error: unrecognized opcode `leaf(opal_atomic_cmpset_32)' 
atomic-asm.s:49: Error: unrecognized opcode `end(opal_atomic_cmpset_32)' 
atomic-asm.s:52: Error: unrecognized opcode `leaf(opal_atomic_cmpset_acq_32)' 
atomic-asm.s:69: Error: unrecognized opcode `end(opal_atomic_cmpset_acq_32)' 
atomic-asm.s:72: Error: unrecognized opcode `leaf(opal_atomic_cmpset_rel_32)' 
atomic-asm.s:89: Error: unrecognized opcode `end(opal_atomic_cmpset_rel_32)' 
atomic-asm.s:92: Error: unrecognized opcode `leaf(opal_atomic_cmpset_64)' 
atomic-asm.s:108: Error: unrecognized opcode `end(opal_atomic_cmpset_64)' 
atomic-asm.s:111: Error: unrecognized opcode `leaf(opal_atomic_cmpset_acq_64)' 
atomic-asm.s:127: Error: unrecognized opcode `end(opal_atomic_cmpset_acq_64)' 
atomic-asm.s:130: Error: unrecognized opcode `leaf(opal_atomic_cmpset_rel_64)' 
atomic-asm.s:147: Error: unrecognized opcode `end(opal_atomic_cmpset_rel_64)' 
make[2]: Leaving directory `/tools/openmpi-1.0a1r7305/opal/asm' 

[OMPI users] IRIX Assembler messages: unrecognized opcode `leaf(opal_atomic_mb)'

2008-04-26 Thread Daniel
Dear Developers,

I am new to Open MPI and IRIX; I am using them for parallel computing with a CFD code.

The attachment contains the full version of the error I hit. Please help, and thanks a lot!

The error message looks like this:
> atomic-asm.s: Assembler messages: 
> atomic-asm.s:8: Error: unrecognized opcode `leaf(opal_atomic_mb)' 


I found the same question in the mailing list archives, asked in 2005 by Jonathan Day:
http://www.open-mpi.org/community/lists/users/2005/09/0138.php
Three years have passed, and I wonder why this error still remains unsolved on IRIX.
Or am I missing something?



Best Regards,
Daniel Venn
--
Tongji Univ.
Shanghai, China

--
IRIX6.5
GNU compiler (gcc g++ gfortran)
GNU binutils (ld etc.)
GNU make
GNU libtool
and so on.

ompi-output.tar.gz
Description: GNU Zip compressed data


Re: [O-MPI users] Question about support for finding MPI processes from a tool

2005-08-05 Thread David Daniel

On Aug 5, 2005, at 7:05 AM, Jeff Squyres wrote:


On Aug 5, 2005, at 8:43 AM, Jim Galarowicz wrote:



We are in the process of implementing a portion of the Etnus MPI
interface to attach to all the processes of an MPI job.  The interface
does automatic process acquisition via the MPIR_Breakpoint-based method
that is described in
the Finding Processes section of this URL:
http://www-unix.mcs.anl.gov/mpi/mpi-debug/

We were wondering if your project will use this interface or if you
are supplying an alternative method.



Yes, Open MPI will support the Etnus (MPIR_Breakpoint) interface to
automatically attach to parallel jobs.


We will also consider supporting other interfaces... if publicly  
documented.
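
For readers unfamiliar with the interface referenced above, the following is a rough sketch of the MPIR process-acquisition symbols exposed on the starter side; the names follow the convention described at the URL above, but the exact layout should be checked against that document rather than taken from this sketch.

/* Rough sketch of the MPIR process-acquisition symbols exposed by the
 * starter process (e.g. mpirun); a tool sets a breakpoint on
 * MPIR_Breakpoint and reads the proctable once it fires. */
typedef struct {
    char *host_name;        /* node on which the MPI process runs  */
    char *executable_name;  /* image the tool reads symbols from   */
    int   pid;              /* pid the tool attaches to            */
} MPIR_PROCDESC;

MPIR_PROCDESC  *MPIR_proctable      = 0;  /* filled in by the starter    */
int             MPIR_proctable_size = 0;
volatile int    MPIR_debug_state    = 0;  /* updated as the job launches */

/* The starter calls this once the proctable is complete. */
void MPIR_Breakpoint(void) {}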


David
--
David Daniel 
Advanced Computing Laboratory, LANL, MS-B287, Los Alamos NM 87545, USA




[OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-04-08 Thread Daniel Milroy
Hello,

Recently a couple of our users have experienced difficulties with compute jobs 
failing with OpenMPI 1.6.4 compiled against GCC 4.7.2, with the nodes running 
kernel 2.6.32-279.5.2.el6.x86_64.  The error is:

File locking failed in ADIOI_Set_lock(fd 7,cmd F_SETLKW/7,type F_WRLCK/1,whence 
0) with return value  and errno 26.
- If the file system is NFS, you need to use NFS version 3, ensure that the 
lockd daemon is running on all the machines, and mount the directory with the 
'noac' option (no attribute caching).
- If the file system is LUSTRE, ensure that the directory is mounted with the 
'flock' option.
ADIOI_Set_lock:: Function not implemented
ADIOI_Set_lock:offset 0, length 8

The file system in question is indeed Lustre, and mounting with flock isn't 
possible in our environment.  I recommended the following changes to the users' 
code:

MPI_Info_set(info, "collective_buffering", "true");
MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
MPI_Info_set(info, "romio_ds_read", "disable");
MPI_Info_set(info, "romio_ds_write", "disable");

This results in the same error as before.  Are there any other MPI options I 
can set?
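
For context, a minimal sketch of how hints like these are typically attached to the file handle via an MPI_Info object at open time; the file name, communicator, and access mode below are placeholders, not taken from the users' code.

#include <mpi.h>

/* Minimal sketch: attach the ROMIO hints above to the file at open time.
 * "output.dat" and the access mode are placeholders. */
int open_with_hints(MPI_Comm comm, MPI_File *fh)
{
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "collective_buffering", "true");
    MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
    MPI_Info_set(info, "romio_ds_read", "disable");
    MPI_Info_set(info, "romio_ds_write", "disable");

    int rc = MPI_File_open(comm, "output.dat",
                           MPI_MODE_CREATE | MPI_MODE_WRONLY, info, fh);
    MPI_Info_free(&info);
    return rc;
}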


Thank you in advance for any advice,

Dan Milroy


Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-04-14 Thread Daniel Milroy
Hello Jeff,

I will pass your recommendation to the users and apprise you when I receive a 
response.


Thank you,

Dan Milroy

-Original Message-
From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres 
(jsquyres)
Sent: Friday, April 11, 2014 6:45 AM
To: Open MPI Users
Subject: Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

Sorry for the delay in replying.

Can you try upgrading to Open MPI 1.8, which was released last week?  We 
refreshed the version of ROMIO that is included in OMPI 1.8 vs. 1.6.


On Apr 8, 2014, at 6:49 PM, Daniel Milroy  wrote:

> Hello,
>  
> Recently a couple of our users have experienced difficulties with compute 
> jobs failing with OpenMPI 1.6.4 compiled against GCC 4.7.2, with the nodes 
> running kernel 2.6.32-279.5.2.el6.x86_64.  The error is:
>  
> File locking failed in ADIOI_Set_lock(fd 7,cmd F_SETLKW/7,type 
> F_WRLCK/1,whence 0) with return value  and errno 26.
> - If the file system is NFS, you need to use NFS version 3, ensure that the 
> lockd daemon is running on all the machines, and mount the directory with the 
> 'noac' option (no attribute caching).
> - If the file system is LUSTRE, ensure that the directory is mounted with the 
> 'flock' option.
> ADIOI_Set_lock:: Function not implemented ADIOI_Set_lock:offset 0, 
> length 8
>  
> The file system in question is indeed Lustre, and mounting with flock isn't 
> possible in our environment.  I recommended the following changes to the 
> users' code:
>  
> MPI_Info_set(info, "collective_buffering", "true"); MPI_Info_set(info, 
> "romio_lustre_ds_in_coll", "disable"); MPI_Info_set(info, 
> "romio_ds_read", "disable"); MPI_Info_set(info, "romio_ds_write", 
> "disable");
>  
> Which results in the same error as before.  Are there any other MPI options I 
> can set?
>  
>  
> Thank you in advance for any advice,
>  
> Dan Milroy
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] File locking in ADIO, OpenMPI 1.6.4

2014-04-15 Thread Daniel Milroy
Hi Rob,

The applications of the two users in question are different; I haven't
looked through much of either code.  I can respond to your highlighted
situations in sequence:

>- everywhere in NFS.  If you have a Lustre file system exported to some
>clients as NFS, you'll get NFS (er, that might not be true unless you
>pick up a recent patch)
The compute nodes are Lustre clients mounting the file system via IB.

>- note: you don't need to disable data sieving for reads, though you
>might want to if the data sieving algorithm is wasting a lot of data.
That's good to know, though given the applications I can't say whether
data sieving is wasting data.

>- if atomic mode was set on the file (i.e. you called
>MPI_File_set_atomicity)
>- if you use any of the shared file pointer operations
>- if you use any of the ordered mode collective operations
I don't know but will pass these questions on to the users.
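
(For reference, a sketch in code form of the call-based cases Rob lists, which the users' codes could be grepped for; the file handle, buffers, and status below are placeholders, not taken from the actual applications.)

#include <mpi.h>

/* The call-based cases from Rob's list that route through ADIOI_Set_lock;
 * a quick grep for these calls can narrow down which case applies. */
void lock_triggering_examples(MPI_File fh, int *buf, MPI_Status *st)
{
    MPI_File_set_atomicity(fh, 1);                    /* atomic mode        */
    MPI_File_write_shared(fh, buf, 1, MPI_INT, st);   /* shared file ptr    */
    MPI_File_read_shared(fh, buf, 1, MPI_INT, st);
    MPI_File_write_ordered(fh, buf, 1, MPI_INT, st);  /* ordered mode coll. */
    MPI_File_read_ordered(fh, buf, 1, MPI_INT, st);
}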



Thank you,

Dan Milroy




On 4/14/14, 2:23 PM, "Rob Latham"  wrote:

>
>
>On 04/08/2014 05:49 PM, Daniel Milroy wrote:
>> Hello,
>>
>> The file system in question is indeed Lustre, and mounting with flock
>> isn't possible in our environment.  I recommended the following changes
>> to the users' code:
>
>Hi.  I'm the ROMIO guy, though I do rely on the community to help me
>keep the lustre driver up to snuff.
>
>> MPI_Info_set(info, "collective_buffering", "true");
>> MPI_Info_set(info, "romio_lustre_ds_in_coll", "disable");
>> MPI_Info_set(info, "romio_ds_read", "disable");
>> MPI_Info_set(info, "romio_ds_write", "disable");
>>
>> Which results in the same error as before.  Are there any other MPI
>> options I can set?
>
>I'd like to hear more about the workload generating these lock messages,
>but I can tell you the situations in which ADIOI_SetLock gets called:
>- everywhere in NFS.  If you have a Lustre file system exported to some
>clients as NFS, you'll get NFS (er, that might not be true unless you
>pick up a recent patch)
>- when writing a non-contiguous region in file, unless you disable data
>sieving, as you did above.
>- note: you don't need to disable data sieving for reads, though you
>might want to if the data sieving algorithm is wasting a lot of data.
>- if atomic mode was set on the file (i.e. you called
>MPI_File_set_atomicity)
>- if you use any of the shared file pointer operations
>- if you use any of the ordered mode collective operations
>
>you've turned off data sieving writes, which is what I would have first
>guessed would trigger this lock message.  So I guess you are hitting one
>of the other cases.
>
>==rob
>
>-- 
>Rob Latham
>Mathematics and Computer Science Division
>Argonne National Lab, IL USA
>___
>users mailing list
>us...@open-mpi.org
>http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Docker Cluster Queue Manager

2016-06-04 Thread Daniel Letai

  
  
Did you check shifter?
https://www.nersc.gov/assets/Uploads/cug2015udi.pdf
http://www.nersc.gov/research-and-development/user-defined-images/
https://github.com/NERSC/shifter

On 06/03/2016 01:58 AM, Rob Nagler wrote:

We would like to use MPI on Docker with arbitrarily configured clusters
(e.g. created with StarCluster or bare metal). What I'm curious about is if
there is a queue manager that understands Docker, file systems, MPI, and
OpenAuth. JupyterHub does a lot of this, but it doesn't interface with MPI.
Ideally, we'd like users to be able to queue up jobs directly from JupyterHub.

Currently, we can configure and initiate an MPI-compatible Docker cluster
running on a VPC using Salt. What's missing is the ability to manage a queue
of these clusters. Here's a list of requirements:

- JupyterHub users do not have Unix user ids
- Containers must be started as a non-root guest user (--user)
- JupyterHub user's data directory is mounted in container
- Data is shared via NFS or other cluster file system
- sshd runs in container for MPI as guest user
- Results have to be reported back to GitHub user
- MPI network must be visible (--net=host)
- Queue manager must be compatible with the above
- JupyterHub user is not allowed to interact with Docker directly
- Docker images are user selectable (from an approved list)
- Jupyter and MPI containers started from same image

Know of a system which supports this?

Our code and config are open source, and your feedback would be greatly
appreciated.

Salt configuration: https://github.com/radiasoft/salt-conf
Container builders: https://github.com/radiasoft/containers/tree/master/radiasoft
Early phase wiki: https://github.com/radiasoft/devops/wiki/DockerMPI

Thanks,

Rob


  
  
  
  
  ___
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29355.php


  



Re: [OMPI users] Docker Cluster Queue Manager

2016-06-06 Thread Daniel Letai

  
  
That's why they have acl in ZoL, no?

just bring up a new filesystem for each container, with acl so only
the owning container can use that fs, and you should be done, no?

To be clear, each container would have to have a unique uid for this
to work, but together with Ralph's idea of a uid pool this would
provide good isolation.
The reason for ZoL filesystems is to ensure isolation as well as the
other benefits of zfs to docker...

Anyway, clusterhq seem to have a nice product called flocker, which
might also be relevant for this.

On 06/06/2016 12:07 PM, John Hearns wrote:

Rob, I am not familiar with wakari.io.

However, what you say about the Unix userid problem is very relevant to many
'shared infrastructure' projects and is a topic which comes up in discussions
about them. The concern there is, as you say, that if the managers of the
system have a global filesystem with shared datasets, and virtual clusters
are created on the shared infrastructure, or containers are used, then users
with root access can have privileges over the global filesystem.

You are making some very relevant points here.









On 5 June 2016 at 01:51, Rob Nagler wrote:

Thanks! SLURM Elastic Computing seems like it might do the trick. I need to
try it out.

xCAT is interesting, too. It seems to be the HPC version of Salt'ed Cobbler.
:)  I don't know that it's so important for our problem. We have a small
cluster for testing against the cloud, primarily. I could see xCAT being
quite powerful for large clusters.

I'm not sure how to explain the Unix user id problem other than that a gmail
account does not have a corresponding Unix user id. Nor do you have one for
your representation on this mailing list. That decoupling is important. The
actual execution of Unix processes on behalf of users of gmail, this mailing
list, etc. runs as a single Unix user. That's how JupyterHub containers run.
When you click "Start Server" in JupyterHub, it starts a Docker container as
some system user (uid=1000 in our case), and the container is given access
to the user's files via a Docker volume. The container cannot see any other
user's files.

In a typical HPC context, the files are all in /home/. The "containment" is
done by normal Unix file permissions. It's very easy, but it doesn't work
for web apps as described above. Even being able to list all the other users
on a system (via "ls /home") is a privacy breach in a web app.

Rob


  
  
  ___
  users mailing list
  us...@open-mpi.org
  Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
  Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29369.php

  
  

  
  
  
  
  ___
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29377.php


  



Re: [OMPI users] Docker Cluster Queue Manager

2016-06-07 Thread Daniel Letai

  
  


On 06/06/2016 06:32 PM, Rob Nagler wrote:

Thanks, John. I sometimes wonder if I'm the only one out there with this
particular problem.

Ralph, thanks for sticking with me. :) Using a pool of uids doesn't really
work due to the way cgroups/containers work. It also would require changing
the permissions of all of the user's files, which would create issues for
Jupyter/Hub's access to the files, which is used for in situ monitoring.

Docker does not yet handle uid mapping at the container level (1.10 added
mappings for the daemon). We have solved this problem by adding a uid/gid
switcher at container startup for our images. The trick is to change the
uid/gid of the "container user" with usermod and groupmod. This only works,
however, with images we provide. I'd like a solution that allows us to start
arbitrary/unsafe images, relying on cgroups to do their job.

Gilles, the containers do lock the user down, but the problem is that the
file system space has to be dynamically bound to the containers across the
cluster. JupyterHub solves this problem by understanding the concept of a
user, and providing a hook to change the directory to be mounted.

Daniel, we've had bad experiences with ZoL. Its allocation algorithm
degrades rapidly when the file system gets over 80% full. It is still not
integrated into major distros, which leads to dkms nightmares on system
upgrades. I don't really see Flocker as helping in this regard, because the
problem is the scheduler, not the file system. We know which directory we
have to mount from the cluster file system; we just need to get the
scheduler to allow us to mount it with the container that is running slurmd.


  

Any storage with high percentage usage will degrade performance. ZoL
is actually nicer than btrfs in that regard, but xfs does handle low
free space better most of the time.
If you have the memory to spare, and the images are mostly
identical, deduplication (or even better - compression) can help in
that regard.
Regarding integration - that's mostly licensing issues, and not a
reflection of the maturity of the technology itself.
Regarding dkms - use kabi-tracking-kmod
Just my 2 cents.

  
I'll play with Slurm Elastic Compute this week to see how it works.

Rob



  
  
  
  
  ___
users mailing list
us...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: http://www.open-mpi.org/community/lists/users/2016/06/29382.php


  



[OMPI users] Potential developer to reinstate Xgrid support

2010-09-30 Thread Daniel Beatty
Greetings all,
I am working on obtaining a developer, or time for myself, to work on
restoring support for MPI using Xgrid.  Do we have any documentation on the
Xgrid support section of Open-MPI, and could you point out to me which
sections were providing the said support in the first place?

Thank you,
Daniel Beatty
Computer Scientist, Detonation Sciences Branch
Code 474300D
2401 E. Pilot Plant Rd. M/S 1109
China Lake, CA 93555
daniel.bea...@navy.mil
(760)939-7097 




Re: [OMPI users] Error when using OpenMPI with SGE multiple hosts

2010-11-17 Thread Daniel Gruber
Hi, 

I'm interested in what is expected from OGE/SGE in order to support
most of your scenarios. First of all, the "-binding pe" request is
not flexible and only makes sense in scenarios where each host has the
same architecture, each involved host is used exclusively for the job
(SGE exclusive job feature), and the same number of slots is allocated
on each host (fixed allocation rule). SGE just writes out the
socket,core tuples (determined on the master task host) in the
pe_hostfile (the same for each host!). SGE does no binding itself.
Therefore I think we should have a deeper look at the more flexible
"-binding [set]" request.

1. One qrsh (--inherit) per slot

If a (legacy) parallel application does a qrsh for *each* granted
slot (regardless of whether it calls the local host or a remote host),
this should work out of the box with OGE/SGE with the
"-binding linear:1" request in OGE tight integration.
What might be confusing is that when doing a "qstat -cb -j "
just one core is shown as allocated (which is a bug).
But when looking at the host level (qstat -F m_topology_inuse)
the allocated cores can be seen. This should work with
different allocation rules.

2. One qrsh per host (OpenMPI case)

This should work under the following constraints:
- OGE tight integration (control_slaves true)
- fixed allocation schema (allocation_rule N)
Then what is needed is simply to call qsub with
"-binding linear:N". The master script on the master host and all
orted daemons on the remote hosts are then bound (if there are free
cores) to N successive cores. Here orted detects this and binds its
threads each to one of the detected cores (when the MPI command line
parameter is present) - right?

What does not work is having an OGE/SGE allocation_rule of round
robin or fill up, since the number of slots per host is unknown at
submission time and differs for each host. Am I right that this is
currently the only drawback when using SGE and OpenMPI?

The next thing in the discussion was the alignment of cores and
slots. Because the term "slots" is very flexible in SGE/OGE and does
not in all cases reflect the number of cores (in the case of SMT, for
example), a compiled-in mapping does not exist at the moment. What
people could do is enforce such a mapping via JSV scripts, which do
the necessary reformulation of the request (modify #slots or #cores
if necessary).

Did I miss any important points from the SGE/OGE point of view?


Cheers

Daniel


On Tuesday, 16.11.2010, at 18:24 -0700, Ralph Castain wrote:
> 
> 
> On Tue, Nov 16, 2010 at 12:23 PM, Terry Dontje
>  wrote:
> On 11/16/2010 01:31 PM, Reuti wrote: 
> > Hi Ralph,
> > 
> > Am 16.11.2010 um 15:40 schrieb Ralph Castain:
> > 
> > > > 2. have SGE bind procs it launches to -all- of those cores. I 
> believe SGE does this automatically to constrain the procs to running on only 
> those cores.
> > > This is another "bug/feature" in SGE: it's a matter of 
> discussion, whether the shepherd should get exactly one core (in case you use 
> more than one `qrsh`per node) for each call, or *all* cores assigned (which 
> we need right now, as the processes in Open MPI will be forks of orte 
> daemon). About such a situtation I filled an issue a long time ago and 
> "limit_to_one_qrsh_per_host yes/no" in the PE definition would do (this 
> setting should then also change the core allocation of the master process):
> > > 
> > > http://gridengine.sunsource.net/issues/show_bug.cgi?id=1254
> > > 
> > > I believe this is indeed the crux of the issue
> > fantastic to share the same view.
> > 
> FWIW, I think I agree too.
> 
> > > > 3. tell OMPI to --bind-to-core.
> > > > 
> > > > In other words, tell SGE to allocate a certain number of cores 
> on each node, but to bind each proc to all of them (i.e., don't bind a proc 
> to a specific core). I'm pretty sure that is a standard SGE option today (at 
> least, I know it used to be). I don't believe any patch or devel work is 
> required (to either SGE or OMPI).
> > > When you use a fixed allocation_rule and a matching -binding 
> request it will work today. But any other case won't be distributed in the 
> correct way.
> > > 
> > > Is it possible to not include the -binding request? If SGE is 
> told to use a fixed allocation_rule, and to allocate (for example) 2 
> cores/node, then won't the orted see 
> > > itself bound to two specific cores on each node?
> > When you leave out the -bi

[OMPI users] Segfault on mpirun with OpenMPI 1.4.5rc2

2012-01-31 Thread Daniel Milroy
Hello,

I have built OpenMPI 1.4.5rc2 with Intel 12.1 compilers in an HPC
environment.  We are running RHEL 5, kernel 2.6.18-238 with Intel Xeon
X5660 cpus.  You can find my build options below.  In an effort to
test the OpenMPI build, I compiled "Hello world" with an mpi_init call
in C and Fortran.  Mpirun of both versions on a single node results in
a segfault.  I have attached the pertinent portion of gdb's output of
the "Hello world" core dump.  Submitting a parallel "Hello world" job
to torque results in segfaults across the respective nodes.  However,
if I execute mpirun of C or Fortran "Hello world" following a segfault
the program will exit successfully.  Additionally, if I strace mpirun
on either a single node or on multiple nodes in parallel "Hello world"
runs successfully.  I am unsure how to proceed- any help would be
greatly appreciated.
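
For reference, a minimal MPI "Hello world" of the kind described, with an MPI_Init call, looks like this (a sketch, not the exact program used):

#include <stdio.h>
#include <mpi.h>

/* Minimal "Hello world": a bare MPI_Init / MPI_Finalize pair is enough
 * to reproduce a launch-time segfault of the kind described above. */
int main(int argc, char **argv)
{
    int rank = -1;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("Hello world from rank %d\n", rank);
    MPI_Finalize();
    return 0;
}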


Thank you in advance,

Dan Milroy


Build options:

source /ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/iccvars.sh intel64
source /ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/ifortvars.sh
intel64
export CC=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/icc
export CXX=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/icpc
export F77=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
export F90=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
export FC=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
./configure --prefix=/openmpi-1.4.5rc2_intel-12.1
--with-tm=/torque-2.5.8/ --enable-shared --enable-static --without-psm


GDB_hello.c_core_dump
Description: Binary data


Re: [OMPI users] Segfault on mpirun with OpenMPI 1.4.5rc2

2012-02-01 Thread Daniel Milroy
Hi Jeff,

Pending further testing, your suggestion seems to have fixed the
issue.  Thank you very much.


Dan Milroy


2012/1/31 Jeff Squyres :
> We have heard reports of failures with the Intel 12.1 compilers.
>
> Can you try with rc4 (that was literally just released) with the 
> --without-memory-manager configure option?
>
>
> On Jan 31, 2012, at 2:19 PM, Daniel Milroy wrote:
>
>> Hello,
>>
>> I have built OpenMPI 1.4.5rc2 with Intel 12.1 compilers in an HPC
>> environment.  We are running RHEL 5, kernel 2.6.18-238 with Intel Xeon
>> X5660 cpus.  You can find my build options below.  In an effort to
>> test the OpenMPI build, I compiled "Hello world" with an mpi_init call
>> in C and Fortran.  Mpirun of both versions on a single node results in
>> a segfault.  I have attached the pertinent portion of gdb's output of
>> the "Hello world" core dump.  Submitting a parallel "Hello world" job
>> to torque results in segfaults across the respective nodes.  However,
>> if I execute mpirun of C or Fortran "Hello world" following a segfault
>> the program will exit successfully.  Additionally, if I strace mpirun
>> on either a single node or on multiple nodes in parallel "Hello world"
>> runs successfully.  I am unsure how to proceed- any help would be
>> greatly appreciated.
>>
>>
>> Thank you in advance,
>>
>> Dan Milroy
>>
>>
>> Build options:
>>
>>        source /ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/iccvars.sh 
>> intel64
>>        source /ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/ifortvars.sh
>> intel64
>>        export CC=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/icc
>>        export CXX=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/icpc
>>        export 
>> F77=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
>>        export 
>> F90=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
>>        export FC=/ics_2012.0.032/composer_xe_2011_sp1.6.233/bin/intel64/ifort
>>        ./configure --prefix=/openmpi-1.4.5rc2_intel-12.1
>> --with-tm=/torque-2.5.8/ --enable-shared --enable-static --without-psm
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Segfault on mpirun with OpenMPI 1.4.5rc2

2012-02-01 Thread Daniel Milroy
Hi Götz,

I don't know whether we can implement your suggestion; it is dependent
on the terms of our license with Intel.  I will take this under
advisement.  Thank you very much.


Dan Milroy


2012/2/1 Götz Waschk :
> On Tue, Jan 31, 2012 at 8:19 PM, Daniel Milroy
>  wrote:
>> Hello,
>>
>> I have built OpenMPI 1.4.5rc2 with Intel 12.1 compilers in an HPC
>> environment.  We are running RHEL 5, kernel 2.6.18-238 with Intel Xeon
>> X5660 cpus.  You can find my build options below.  In an effort to
>> test the OpenMPI build, I compiled "Hello world" with an mpi_init call
>> in C and Fortran.  Mpirun of both versions on a single node results in
>> a segfault.  I have attached the pertinent portion of gdb's output of
>> the "Hello world" core dump.
>
> Hi Daniel,
>
> that looks like the problem I had with my intel build of openmpi. I
> could solve it by upgrading the Intel Compiler version to 12.1.2.273:
> % icc -v
> icc version 12.1.2 (gcc version 4.4.5 compatibility)
> % icc -V
> Intel(R) C Intel(R) 64 Compiler XE for applications running on
> Intel(R) 64, Version 12.1 Build 2028
> Copyright (C) 1985-2011 Intel Corporation.  All rights reserved.
>
>
> After a rebuild of the openmpi runtime, the crashes went away. I was
> using openmpi 1.5.3, but you could still have the same problem.
>
> Regards, Götz
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-30 Thread Daniel Junglas
Hi,

I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
Compilation and installation worked without a problem. However,
when trying to run an application with mpirun I always faced
this error:

[hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on 
MULTICAST_IF
for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
Error: Invalid argument (22)
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at line 
56
[hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rmcast_base_select failed
  --> Returned value Error (-1) instead of ORTE_SUCCESS


After some digging I found that the following patch seems to fix the
problem (at least the application seems to run correctly now):
--- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
+++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
@@ -936,9 +936,16 @@
 }
 } else {
 /* on the xmit side, need to set the interface */
+void const *addrptr;
 memset(&inaddr, 0, sizeof(inaddr));
 inaddr.sin_addr.s_addr = htonl(chan->interface);
+#ifdef __sun
+addrlen = sizeof(inaddr.sin_addr);
+addrptr = (void *)&inaddr.sin_addr;
+#else
 addrlen = sizeof(struct sockaddr_in);
+addrptr = (void *)&inaddr;
+#endif

 OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
  "setup:socket:xmit interface 
%03d.%03d.%03d.%03d",
@@ -945,7 +952,7 @@
  OPAL_IF_FORMAT_ADDR(chan->interface)));

 if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF, 
-(void *)&inaddr, addrlen)) < 0) {
+addrptr, addrlen)) < 0) {
 opal_output(0, "%s rmcast:init: setsockopt() failed on 
MULTICAST_IF\n"
 "\tfor multicast network %03d.%03d.%03d.%03d 
interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
 ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
Can anybody confirm that the patch is good/correct? In particular
that the '__sun' part is the right thing to do?
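
For comparison, the portable form the patch moves toward looks roughly like this (a sketch with a placeholder interface address, not the actual Open MPI code): IP_MULTICAST_IF is given a struct in_addr rather than a whole struct sockaddr_in, which Solaris rejects with EINVAL.

#include <stdint.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Sketch: select the outgoing multicast interface by passing only the
 * interface address.  'iface_addr' (host byte order) is a placeholder
 * for chan->interface in the code above. */
static int set_xmit_interface(int sd, uint32_t iface_addr)
{
    struct in_addr ifaddr;
    ifaddr.s_addr = htonl(iface_addr);
    return setsockopt(sd, IPPROTO_IP, IP_MULTICAST_IF,
                      &ifaddr, sizeof(ifaddr));
}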

Thanks,

Daniel


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-30 Thread Daniel Junglas
I built from a tarball, not svn. In the VERSION file I have
  svn_r=r26429
Is that the information you asked for?

Daniel

users-boun...@open-mpi.org wrote on 07/30/2012 04:15:45 PM:
> 
> Do you know what r# of 1.6 you were trying to compile?  Is this via 
> the tarball or svn?
> 
> thanks,
> 
> --td
> 
> On 7/30/2012 9:41 AM, Daniel Junglas wrote: 
> Hi,
> 
> I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
> Compilation and installation worked without a problem. However,
> when trying to run an application with mpirun I always faced
> this error:
> 
> [hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on 
> MULTICAST_IF
> for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
> Error: Invalid argument (22)
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at 
line 
> 56
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
> 
--
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_rmcast_base_select failed
>   --> Returned value Error (-1) instead of ORTE_SUCCESS
> 
> 
> After some digging I found that the following patch seems to fix the
> problem (at least the application seems to run correct now):
> --- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
> +++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
> @@ -936,9 +936,16 @@
>  }
>  } else {
>  /* on the xmit side, need to set the interface */
> +void const *addrptr;
>  memset(&inaddr, 0, sizeof(inaddr));
>  inaddr.sin_addr.s_addr = htonl(chan->interface);
> +#ifdef __sun
> +addrlen = sizeof(inaddr.sin_addr);
> +addrptr = (void *)&inaddr.sin_addr;
> +#else
>  addrlen = sizeof(struct sockaddr_in);
> +addrptr = (void *)&inaddr;
> +#endif
> 
>  OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
>   "setup:socket:xmit interface 
> %03d.%03d.%03d.%03d",
> @@ -945,7 +952,7 @@
>   OPAL_IF_FORMAT_ADDR(chan->interface)));
> 
>  if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF, 
> -(void *)&inaddr, addrlen)) < 0) {
> +addrptr, addrlen)) < 0) {
>  opal_output(0, "%s rmcast:init: setsockopt() failed on 
> MULTICAST_IF\n"
>  "\tfor multicast network %03d.%03d.%03d.%03d 
> interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
>  ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
> Can anybody confirm that the patch is good/correct? In particular
> that the '__sun' part is the right thing to do?
> 
> Thanks,
> 
> Daniel

> 

> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> -- 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com

> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


smime.p7s
Description: S/MIME Cryptographic Signature


Re: [OMPI users] setsockopt() fails with EINVAL on solaris

2012-07-31 Thread Daniel Junglas
Thanks,

configuring with '--enable-mca-no-build=rmcast' did the trick for me.

Daniel

users-boun...@open-mpi.org wrote on 07/30/2012 04:21:13 PM:
> FWIW: the rmcast framework shouldn't be in 1.6. Jeff and I are 
> testing removal and should have it out of there soon.
> 
> Meantime, the best solution is to "--enable-mca-no-build rmcast"
> 
> On Jul 30, 2012, at 7:15 AM, TERRY DONTJE wrote:
> 
> Do you know what r# of 1.6 you were trying to compile?  Is this via 
> the tarball or svn?
> 
> thanks,
> 
> --td
> 
> On 7/30/2012 9:41 AM, Daniel Junglas wrote: 
> Hi,
> 
> I compiled OpenMPI 1.6 on a 64bit Solaris ultrasparc machine.
> Compilation and installation worked without a problem. However,
> when trying to run an application with mpirun I always faced
> this error:
> 
> [hostname:14798] [[50433,0],0] rmcast:init: setsockopt() failed on 
> MULTICAST_IF
> for multicast network xxx.xxx.xxx.xxx interface xxx.xxx.xxx.xxx
> Error: Invalid argument (22)
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 825
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 744
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/rmcast/udp/rmcast_udp.c at line 193
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../openmpi-1.6/orte/mca/rmcast/base/rmcast_base_select.c at 
line 
> 56
> [hostname:14798] [[50433,0],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../openmpi-1.6/orte/mca/ess/hnp/ess_hnp_module.c at line 233
> 
--
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
> 
>   orte_rmcast_base_select failed
>   --> Returned value Error (-1) instead of ORTE_SUCCESS
> 
> 
> After some digging I found that the following patch seems to fix the
> problem (at least the application seems to run correct now):
> --- a/orte/mca/rmcast/udp/rmcast_udp.c  Tue Apr  3 16:30:29 2012
> +++ b/orte/mca/rmcast/udp/rmcast_udp.c  Mon Jul 30 15:12:02 2012
> @@ -936,9 +936,16 @@
>  }
>  } else {
>  /* on the xmit side, need to set the interface */
> +void const *addrptr;
>  memset(&inaddr, 0, sizeof(inaddr));
>  inaddr.sin_addr.s_addr = htonl(chan->interface);
> +#ifdef __sun
> +addrlen = sizeof(inaddr.sin_addr);
> +addrptr = (void *)&inaddr.sin_addr;
> +#else
>  addrlen = sizeof(struct sockaddr_in);
> +addrptr = (void *)&inaddr;
> +#endif
> 
>  OPAL_OUTPUT_VERBOSE((2, orte_rmcast_base.rmcast_output,
>   "setup:socket:xmit interface 
> %03d.%03d.%03d.%03d",
> @@ -945,7 +952,7 @@
>   OPAL_IF_FORMAT_ADDR(chan->interface)));
> 
>  if ((setsockopt(target_sd, IPPROTO_IP, IP_MULTICAST_IF, 
> -(void *)&inaddr, addrlen)) < 0) {
> +addrptr, addrlen)) < 0) {
>  opal_output(0, "%s rmcast:init: setsockopt() failed on 
> MULTICAST_IF\n"
>      "\tfor multicast network %03d.%03d.%03d.%03d 
> interface %03d.%03d.%03d.%03d\n\tError: %s (%d)",
>  ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
> Can anybody confirm that the patch is good/correct? In particular
> that the '__sun' part is the right thing to do?
> 
> Thanks,
> 
> Daniel

> 

> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> -- 
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
> Oracle - Performance Technologies
> 95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com

> 

> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


smime.p7s
Description: S/MIME Cryptographic Signature


[OMPI users] [threads] How to configure Open MPI for thread support

2012-10-08 Thread Daniel Mitchell
Hi everyone,

I'm writing a hybrid parallel program and it seems that unless I
configure Open MPI with --enable-thread-multiple, then MPI_Init_thread
always provides MPI_THREAD_SINGLE, regardless of what I pass for the
required argument.

Does this mean that I have to configure with --enable-thread-multiple
even to use FUNNELED and SERIALIZED threads?
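
For concreteness, the behavior described corresponds to a check like the following minimal sketch:

#include <stdio.h>
#include <mpi.h>

/* Minimal sketch: request a thread level and compare it against what
 * the library reports it actually provides. */
int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED)
        printf("requested MPI_THREAD_FUNNELED, got level %d\n", provided);
    MPI_Finalize();
    return 0;
}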

Daniel


[OMPI users] Performance/stability impact of thread support

2012-10-29 Thread Daniel Mitchell
Hi everyone,

I've asked my linux distribution to repackage Open MPI with thread support 
(meaning configure with --enable-thread-multiple). They are willing to do this 
if it won't have any performance/stability hit for Open MPI users who don't 
need thread support (meaning everyone but me, apparently). Does enabling thread 
support impact performance/stability?

Daniel


[OMPI users] mpi problems/many cpus per node

2012-12-14 Thread Daniel Davidson
I have had to cobble together two machines in our Rocks cluster without
using the standard installation; they have EFI-only BIOS on them and Rocks
doesn't like that, so it is the only workaround.


Everything works great now, except for one thing.  MPI jobs (openmpi or 
mpich) fail when started from one of these nodes (via qsub or by logging 
in and running the command) if 24 or more processors are needed on 
another system.  However if the originator of the MPI job is the 
headnode or any of the preexisting compute nodes, it works fine.  Right 
now I am guessing ssh client or ulimit problems, but I cannot find any 
difference.  Any help would be greatly appreciated.


compute-2-1 and compute-2-0 are the new nodes

Examples:

This works, prints 23 hostnames from each machine:
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 46 hostname


This does not work, prints 24 hostnames for compute-2-1
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 48 hostname


These both work, print 64 hostnames from each node
[root@biocluster ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 128 hostname
[root@compute-0-2 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 128 hostname


[root@compute-2-1 ~]# ulimit -a
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 16410016
max locked memory   (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files  (-n) 4096
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) unlimited
cpu time   (seconds, -t) unlimited
max user processes  (-u) 1024
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited

[root@compute-2-1 ~]# more /etc/ssh/ssh_config
Host *
CheckHostIP no
ForwardX11  yes
ForwardAgentyes
StrictHostKeyChecking   no
UsePrivilegedPort   no
Protocol2,1



Re: [OMPI users] mpi problems/many cpus per node

2012-12-14 Thread Daniel Davidson
Oddly enough, adding this debugging info lowered the number of
processes that can be used down to 42 from 46.  When I run the MPI job, it
fails, giving only the information that follows:


[root@compute-2-1 ssh]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 44 --leave-session-attached -mca 
odls_base_verbose 5 hostname
[compute-2-1.local:44374] mca:base:select:( odls) Querying component 
[default]
[compute-2-1.local:44374] mca:base:select:( odls) Query of component 
[default] set priority to 1
[compute-2-1.local:44374] mca:base:select:( odls) Selected component 
[default]
[compute-2-0.local:28950] mca:base:select:( odls) Querying component 
[default]
[compute-2-0.local:28950] mca:base:select:( odls) Query of component 
[default] set priority to 1
[compute-2-0.local:28950] mca:base:select:( odls) Selected component 
[default]

compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local


On 12/14/2012 03:18 PM, Ralph Castain wrote:

It wouldn't be ssh - in both cases, only one ssh is being done to each node (to 
start the local daemon). The only difference is the number of fork/exec's being 
done on each node, and the number of file descriptors being opened to support 
those fork/exec's.

It certainly looks like your limits are high enough. When you say it "fails", 
what do you mean - what error does it report? Try adding:

--leave-session-attached -mca odls_base_verbose 5

to your cmd line - this will report all the local proc launch debug and 
hopefully show you a more detailed error report.


On Dec 14, 2012, at 12:29 PM, Daniel Davidson  wrote:


I have had to cobble together two machines in our rocks cluster without using 
the standard installation, they have efi only bios on them and rocks doesnt 
like that, so it is the only workaround.

Everything works great now, except for one thing.  MPI jobs (openmpi or mpich) 
fail when started from one of these nodes (via qsub or by logging in and 
running the command) if 24 or more processors are needed on another system.  
However if the originator of the MPI job is the headnode or any of the 
preexisting compute nodes, it works fine.  Right now I am guessing ssh client 
or ulimit problems, but I cannot find any difference.  Any help would be 
greatly appreciated.

compute-2-1 and compute-2-0 are the new nodes

Examples:

This works, prints 23 hostnames from each machine:
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 46 hostname

This does not work, prints 24 hostnames for compute-2-1
[root@compute-2-1 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 48 hostname

These both work, print 64 hostnames from each node
[root@biocluster ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 128 hostname
[root@compute-0-2 ~]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -np 128 hostname

[root@compute-2-1 ~]# ulimit -a
core file size  (blocks, -c) 0
data seg size   (kbytes, -d) unlimited
scheduling priority (-e) 0
file size   (blocks, -f) unlimited
pending signals (-i) 16410016
max locked memory   (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files  (-n) 4096
pipe size(512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority  (-r) 0
stack size  (kbytes, -s) unlimited
cpu time   (seconds, -t) unlimited
max user processes  (-u) 1024
virtual memory  (kbytes, -v) unlimited
file locks  (-x) unlimited

[root@compute-2-1 ~]# more /etc/ssh/ssh_config
Host *
CheckHostIP no
ForwardX11  yes
ForwardAgentyes
StrictHostKeyChecking   no
UsePrivilegedPort   no
Protocol2,1

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users





Re: [OMPI users] mpi problems/many cpus per node

2012-12-14 Thread Daniel Davidson
 
abort file /tmp/openmpi-sessions-root@compute-2-1.local_0/3245604865/7/abort
[compute-2-1.local:44855] [[49524,0],0] odls:waitpid_fired child process 
[[49524,1],7] terminated normally
[compute-2-1.local:44855] [[49524,0],0] odls:waitpid_fired checking 
abort file /tmp/openmpi-sessions-root@compute-2-1.local_0/3245604865/5/abort
[compute-2-1.local:44855] [[49524,0],0] odls:waitpid_fired child process 
[[49524,1],5] terminated normally
[compute-2-1.local:44855] [[49524,0],0] odls:waitpid_fired checking 
abort file /tmp/openmpi-sessions-root@compute-2-1.local_0/3245604865/3/abort
[compute-2-1.local:44855] [[49524,0],0] odls:waitpid_fired child process 
[[49524,1],3] terminated normally
[compute-2-1.local:44855] [[49524,0],0] odls:waitpid_fired checking 
abort file /tmp/openmpi-sessions-root@compute-2-1.local_0/3245604865/1/abort
[compute-2-1.local:44855] [[49524,0],0] odls:waitpid_fired child process 
[[49524,1],1] terminated normally

compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],25]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],15]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],11]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],13]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],19]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],9]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],17]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],31]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],7]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],21]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],5]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],33]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],23]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],3]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],29]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],27]
[compute-2-1.local:44855] [[49524,0],0] odls:notify_iof_complete for 
child [[49524,1],1]
[compute-2-1.local:44855] [[49524,0],0] odls:proc_complete reporting all 
procs in [49524,1] terminated

^Cmpirun: killing job...

Killed by signal 2.
[compute-2-1.local:44855] [[49524,0],0] odls:kill_local_proc working on 
WILDCARD



On 12/14/2012 04:11 PM, Ralph Castain wrote:

Sorry - I forgot that you built from a tarball, and so debug isn't enabled by 
default. You need to configure --enable-debug.

On Dec 14, 2012, at 1:52 PM, Daniel Davidson  wrote:


Oddly enough, adding this debugging info, lowered the number of processes that 
can be used down to 42 from 46.  When I run the MPI, it fails giving only the 
information that follows:

[root@compute-2-1 ssh]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 44 --leave-session-attached -mca 
odls_base_verbose 5 hostname
[compute-2-1.local:44374] mca:base:select:( odls) Querying component [default]
[compute-2-1.local:44374] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-1.local:44374] mca:base:select:( odls) Selected component [default]
[compute-2-0.local:28950] mca:base:select:( odls) Querying component [default]
[compute-2-0.local:28950] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-0.local:28950] mca:base:select:( odls) Selected component [default]
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local
compute-2-1.local


On 12/14/2012 03:18 PM, Ralph Castain wrote:

It wouldn't be ssh - in both cases, only one ssh is being done to each node (to 
start the local daemon). The only difference is the number of fork/exec's being 
done on each node, and the number of file descriptors being opened to support 
those fork/exec's.

It certainly looks like your limits are high enough. When you say it "fails", 
what do you mean - what error does it report? Try adding:

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
I will give this a try, but wouldn't that be an issue as well if the 
process was run on the head node or another node?  So long as the mpi 
job is not started on either of these two nodes, it works fine.


Dan

On 12/14/2012 11:46 PM, Ralph Castain wrote:

It must be making contact or ORTE wouldn't be attempting to launch your application's 
procs. Looks more like it never received the launch command. Looking at the code, I 
suspect you're getting caught in a race condition that causes the message to get 
"stuck".

Just to see if that's the case, you might try running this with the 1.7 release 
candidate, or even the developer's nightly build. Both use a different timing 
mechanism intended to resolve such situations.


On Dec 14, 2012, at 2:49 PM, Daniel Davidson  wrote:


Thank you for the help so far.  Here is the information that the debugging 
gives me.  Looks like the daemon on on the non-local node never makes contact.  
If I step NP back two though, it does.

Dan

[root@compute-2-1 etc]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 34 --leave-session-attached -mca 
odls_base_verbose 5 hostname
[compute-2-1.local:44855] mca:base:select:( odls) Querying component [default]
[compute-2-1.local:44855] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-1.local:44855] mca:base:select:( odls) Selected component [default]
[compute-2-0.local:29282] mca:base:select:( odls) Querying component [default]
[compute-2-0.local:29282] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-0.local:29282] mca:base:select:( odls) Selected component [default]
[compute-2-1.local:44855] [[49524,0],0] odls:update:daemon:info updating nidmap
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list unpacking 
data to launch job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list adding new 
jobdat for job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list unpacking 1 
app_contexts
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],0] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],1] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],1] for me!
[compute-2-1.local:44855] adding proc [[49524,1],1] (1) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],2] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],3] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],3] for me!
[compute-2-1.local:44855] adding proc [[49524,1],3] (3) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],4] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],5] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],5] for me!
[compute-2-1.local:44855] adding proc [[49524,1],5] (5) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],6] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],7] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],7] for me!
[compute-2-1.local:44855] adding proc [[49524,1],7] (7) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],8] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],9] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],9] for me!
[compute-2-1.local:44855] adding proc [[49524,1],9] (9) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],10] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],11] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],11] for me!
[compute-2-1.local:44855] adding proc [[49524,1],11] (11) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],12] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],13] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],13] for me!
[compute-2-1.local:44855] adding proc [[49524,1],13] (13) to my local list
[compute-2-1.loca

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
This looks to be having issues as well, and I cannot get any number of 
processors to give me a different result with the new version.


[root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 50 --leave-session-attached -mca 
odls_base_verbose 5 hostname
[compute-2-1.local:69417] mca:base:select:( odls) Querying component 
[default]
[compute-2-1.local:69417] mca:base:select:( odls) Query of component 
[default] set priority to 1
[compute-2-1.local:69417] mca:base:select:( odls) Selected component 
[default]
[compute-2-0.local:24486] mca:base:select:( odls) Querying component 
[default]
[compute-2-0.local:24486] mca:base:select:( odls) Query of component 
[default] set priority to 1
[compute-2-0.local:24486] mca:base:select:( odls) Selected component 
[default]
[compute-2-0.local:24486] [[24939,0],1] odls:kill_local_proc working on 
WILDCARD
[compute-2-0.local:24486] [[24939,0],1] odls:kill_local_proc working on 
WILDCARD
[compute-2-0.local:24486] [[24939,0],1] odls:kill_local_proc working on 
WILDCARD
[compute-2-1.local:69417] [[24939,0],0] odls:kill_local_proc working on 
WILDCARD
[compute-2-1.local:69417] [[24939,0],0] odls:kill_local_proc working on 
WILDCARD


However from the head node:

[root@biocluster openmpi-1.7rc5]# /home/apps/openmpi-1.7rc5/bin/mpirun 
-host compute-2-0,compute-2-1 -v  -np 50  hostname


Displays 25 hostnames from each system.

Thank you again for the help so far,

Dan






On 12/17/2012 08:31 AM, Daniel Davidson wrote:
I will give this a try, but wouldn't that be an issue as well if the 
process was run on the head node or another node?  So long as the mpi 
job is not started on either of these two nodes, it works fine.


Dan

On 12/14/2012 11:46 PM, Ralph Castain wrote:
It must be making contact or ORTE wouldn't be attempting to launch 
your application's procs. Looks more like it never received the 
launch command. Looking at the code, I suspect you're getting caught 
in a race condition that causes the message to get "stuck".


Just to see if that's the case, you might try running this with the 
1.7 release candidate, or even the developer's nightly build. Both 
use a different timing mechanism intended to resolve such situations.



On Dec 14, 2012, at 2:49 PM, Daniel Davidson  
wrote:


Thank you for the help so far.  Here is the information that the 
debugging gives me.  It looks like the daemon on the non-local node 
never makes contact.  If I step NP back by two, though, it does.


Dan

[root@compute-2-1 etc]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 34 --leave-session-attached -mca 
odls_base_verbose 5 hostname
[compute-2-1.local:44855] mca:base:select:( odls) Querying component 
[default]
[compute-2-1.local:44855] mca:base:select:( odls) Query of component 
[default] set priority to 1
[compute-2-1.local:44855] mca:base:select:( odls) Selected component 
[default]
[compute-2-0.local:29282] mca:base:select:( odls) Querying component 
[default]
[compute-2-0.local:29282] mca:base:select:( odls) Query of component 
[default] set priority to 1
[compute-2-0.local:29282] mca:base:select:( odls) Selected component 
[default]
[compute-2-1.local:44855] [[49524,0],0] odls:update:daemon:info 
updating nidmap

[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list 
unpacking data to launch job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list 
adding new jobdat for job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list 
unpacking 1 app_contexts
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- checking proc [[49524,1],0] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- checking proc [[49524,1],1] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- found proc [[49524,1],1] for me!
[compute-2-1.local:44855] adding proc [[49524,1],1] (1) to my local 
list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- checking proc [[49524,1],2] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- checking proc [[49524,1],3] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- found proc [[49524,1],3] for me!
[compute-2-1.local:44855] adding proc [[49524,1],3] (3) to my local 
list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- checking proc [[49524,1],4] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- checking proc [[49524,1],5] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- found proc [[49524,1],5] for me!
[compute-2-1.local:44855] adding proc [[49524,1],5] (5) to my local 
list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list 
- checking proc [[49524,1],6] on daemon 

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
) Querying component 
[default]
[compute-2-0.local:24659] mca:base:select:( odls) Query of component 
[default] set priority to 1
[compute-2-0.local:24659] mca:base:select:( odls) Selected component 
[default]
[compute-2-0.local:24659] [[32341,0],1] plm:rsh_setup on agent ssh : rsh 
path NULL

[compute-2-0.local:24659] [[32341,0],1] plm:base:receive start comm




On 12/17/2012 10:37 AM, Ralph Castain wrote:

?? That was all the output? If so, then something is indeed quite wrong as it 
didn't even attempt to launch the job.

Try adding -mca plm_base_verbose 5 to the cmd line.

I was assuming you were using ssh as the launcher, but I wonder if you are in 
some managed environment? If so, then it could be that launch from a backend 
node isn't allowed (e.g., on gridengine).

On Dec 17, 2012, at 8:28 AM, Daniel Davidson  wrote:


This looks to be having issues as well, and I cannot get any number of 
processors to give me a different result with the new version.

[root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 50 --leave-session-attached -mca 
odls_base_verbose 5 hostname
[compute-2-1.local:69417] mca:base:select:( odls) Querying component [default]
[compute-2-1.local:69417] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-1.local:69417] mca:base:select:( odls) Selected component [default]
[compute-2-0.local:24486] mca:base:select:( odls) Querying component [default]
[compute-2-0.local:24486] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-0.local:24486] mca:base:select:( odls) Selected component [default]
[compute-2-0.local:24486] [[24939,0],1] odls:kill_local_proc working on WILDCARD
[compute-2-0.local:24486] [[24939,0],1] odls:kill_local_proc working on WILDCARD
[compute-2-0.local:24486] [[24939,0],1] odls:kill_local_proc working on WILDCARD
[compute-2-1.local:69417] [[24939,0],0] odls:kill_local_proc working on WILDCARD
[compute-2-1.local:69417] [[24939,0],0] odls:kill_local_proc working on WILDCARD

However from the head node:

[root@biocluster openmpi-1.7rc5]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 50  hostname

Displays 25 hostnames from each system.

Thank you again for the help so far,

Dan






On 12/17/2012 08:31 AM, Daniel Davidson wrote:

I will give this a try, but wouldn't that be an issue as well if the process 
was run on the head node or another node?  So long as the mpi job is not 
started on either of these two nodes, it works fine.

Dan

On 12/14/2012 11:46 PM, Ralph Castain wrote:

It must be making contact or ORTE wouldn't be attempting to launch your application's 
procs. Looks more like it never received the launch command. Looking at the code, I 
suspect you're getting caught in a race condition that causes the message to get 
"stuck".

Just to see if that's the case, you might try running this with the 1.7 release 
candidate, or even the developer's nightly build. Both use a different timing 
mechanism intended to resolve such situations.


On Dec 14, 2012, at 2:49 PM, Daniel Davidson  wrote:


Thank you for the help so far.  Here is the information that the debugging 
gives me.  It looks like the daemon on the non-local node never makes contact.  
If I step NP back by two, though, it does.

Dan

[root@compute-2-1 etc]# /home/apps/openmpi-1.6.3/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 34 --leave-session-attached -mca 
odls_base_verbose 5 hostname
[compute-2-1.local:44855] mca:base:select:( odls) Querying component [default]
[compute-2-1.local:44855] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-1.local:44855] mca:base:select:( odls) Selected component [default]
[compute-2-0.local:29282] mca:base:select:( odls) Querying component [default]
[compute-2-0.local:29282] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-0.local:29282] mca:base:select:( odls) Selected component [default]
[compute-2-1.local:44855] [[49524,0],0] odls:update:daemon:info updating nidmap
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list unpacking 
data to launch job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list adding new 
jobdat for job [49524,1]
[compute-2-1.local:44855] [[49524,0],0] odls:construct_child_list unpacking 1 
app_contexts
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],0] on daemon 1
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 
proc [[49524,1],1] on daemon 0
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - found 
proc [[49524,1],1] for me!
[compute-2-1.local:44855] adding proc [[49524,1],1] (1) to my local list
[compute-2-1.local:44855] [[49524,0],0] odls:constructing child list - checking 

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
After a very long time (15 minutes or so) I finally received the following in 
addition to what I just sent earlier:


[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on 
WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on 
WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on 
WILDCARD

[compute-2-1.local:69655] [[32341,0],0] daemon 1 failed with status 1
[compute-2-1.local:69655] [[32341,0],0] plm:base:orted_cmd sending 
orted_exit commands
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc working on 
WILDCARD
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc working on 
WILDCARD


Firewalls are down:

[root@compute-2-1 /]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination
[root@compute-2-0 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination

On 12/17/2012 11:09 AM, Ralph Castain wrote:

Hmmm...and that is ALL the output? If so, then it never succeeded in sending a 
message back, which leads one to suspect some kind of firewall in the way.

Looking at the ssh line, we are going to attempt to send a message from node 
2-0 to node 2-1 on the 10.1.255.226 address. Is that going to work? Anything 
preventing it?


On Dec 17, 2012, at 8:56 AM, Daniel Davidson  wrote:


These nodes have not been locked down to prevent jobs from being launched from 
the backend, at least not on purpose.  The added logging returns the 
information below:

[root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 10 --leave-session-attached -mca 
odls_base_verbose 5 -mca plm_base_verbose 5 hostname
[compute-2-1.local:69655] mca:base:select:(  plm) Querying component [rsh]
[compute-2-1.local:69655] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh 
path NULL
[compute-2-1.local:69655] mca:base:select:(  plm) Query of component [rsh] set 
priority to 10
[compute-2-1.local:69655] mca:base:select:(  plm) Querying component [slurm]
[compute-2-1.local:69655] mca:base:select:(  plm) Skipping component [slurm]. 
Query failed to return a module
[compute-2-1.local:69655] mca:base:select:(  plm) Querying component [tm]
[compute-2-1.local:69655] mca:base:select:(  plm) Skipping component [tm]. 
Query failed to return a module
[compute-2-1.local:69655] mca:base:select:(  plm) Selected component [rsh]
[compute-2-1.local:69655] plm:base:set_hnp_name: initial bias 69655 nodename 
hash 3634869988
[compute-2-1.local:69655] plm:base:set_hnp_name: final jobfam 32341
[compute-2-1.local:69655] [[32341,0],0] plm:rsh_setup on agent ssh : rsh path 
NULL
[compute-2-1.local:69655] [[32341,0],0] plm:base:receive start comm
[compute-2-1.local:69655] mca:base:select:( odls) Querying component [default]
[compute-2-1.local:69655] mca:base:select:( odls) Query of component [default] 
set priority to 1
[compute-2-1.local:69655] mca:base:select:( odls) Selected component [default]
[compute-2-1.local:69655] [[32341,0],0] plm:base:setup_job
[compute-2-1.local:69655] [[32341,0],0] plm:base:setup_vm
[compute-2-1.local:69655] [[32341,0],0] plm:base:setup_vm creating map
[compute-2-1.local:69655] [[32341,0],0] setup:vm: working unmanaged allocation
[compute-2-1.local:69655] [[32341,0],0] using dash_host
[compute-2-1.local:69655] [[32341,0],0] checking node compute-2-0
[compute-2-1.local:69655] [[32341,0],0] adding compute-2-0 to list
[compute-2-1.local:69655] [[32341,0],0] checking node compute-2-1.local
[compute-2-1.local:69655] [[32341,0],0] plm:base:setup_vm add new daemon 
[[32341,0],1]
[compute-2-1.local:69655] [[32341,0],0] plm:base:setup_vm assigning new daemon 
[[32341,0],1] to node compute-2-0
[compute-2-1.local:69655] [[32341,0],0] plm:rsh: launching vm
[compute-2-1.local:69655] [[32341,0],0] plm:rsh: local shell: 0 (bash)
[compute-2-1.local:69655] [[32341,0],0] plm:rsh: assuming same remote shell as 
local shell
[compute-2-1.local:69655] [[32341,0],0] plm:rsh: remote shell: 0 (bash)
[compute-2-1.local:69655] [[32341,0],0] plm:rsh: final template argv:
/usr/bin/ssh  PATH=/home/apps/openmpi-1.7rc5/bin:$PATH ; export PATH ; 
LD_LIBRARY_PATH=/home/apps/openmpi-1.7rc5/lib:$LD_LIBRARY_PATH ; export LD_LIBRARY_PATH ; 
DYLD_LIBRARY_PATH=/home/apps/openmpi-1.7rc5/lib:$DYLD_LIBRARY_PATH ; export DYLD_LIBRARY_PATH ;   
/home/apps/openmpi-1.7rc5/bin/orted -mca ess env -mca orte_ess_jobid 2119499776 -mca orte_ess_vpid 
 -mca orte_ess_num_procs 2 -mca orte_hnp_uri 
"2119499776.0;tcp://10.1.255.226:46314;tcp://172.16.28.94:46314" -mca orte_use_common_port 
0 --tree-spawn -mca oob tcp -mca odls_base_verb

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson
I would also add that scp seems to be creating the file in the /tmp 
directory of compute-2-0, and that /var/log/secure is showing ssh 
connections being accepted.  Is there anything in ssh that can limit 
connections that I need to look out for?  My guess is that it is part of 
the client prefs and not the server prefs, since I can initiate the MPI 
command from another machine and it works fine, even when it uses 
compute-2-0 and compute-2-1.


Dan


[root@compute-2-1 /]# date
Mon Dec 17 15:11:50 CST 2012
[root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 10 --leave-session-attached -mca 
odls_base_verbose 5 -mca plm_base_verbose 5 hostname

[compute-2-1.local:70237] mca:base:select:(  plm) Querying component [rsh]
[compute-2-1.local:70237] [[INVALID],INVALID] plm:rsh_lookup on agent 
ssh : rsh path NULL


[root@compute-2-0 tmp]# ls -ltr
total 24
-rw-------.  1 root    root       0 Nov 28 08:42 yum.log
-rw-------.  1 root    root    5962 Nov 29 10:50 yum_save_tx-2012-11-29-10-50SRba9s.yumtx
drwx------.  3 danield danield 4096 Dec 12 14:56 openmpi-sessions-danield@compute-2-0_0
drwx------.  3 root    root    4096 Dec 13 15:38 openmpi-sessions-root@compute-2-0_0
drwx------  18 danield danield 4096 Dec 14 09:48 openmpi-sessions-danield@compute-2-0.local_0
drwx------  44 root    root    4096 Dec 17 15:14 openmpi-sessions-root@compute-2-0.local_0


[root@compute-2-0 tmp]# tail -10 /var/log/secure
Dec 17 15:13:40 compute-2-0 sshd[24834]: Accepted publickey for root 
from 10.1.255.226 port 49483 ssh2
Dec 17 15:13:40 compute-2-0 sshd[24834]: pam_unix(sshd:session): session 
opened for user root by (uid=0)
Dec 17 15:13:42 compute-2-0 sshd[24834]: Received disconnect from 
10.1.255.226: 11: disconnected by user
Dec 17 15:13:42 compute-2-0 sshd[24834]: pam_unix(sshd:session): session 
closed for user root
Dec 17 15:13:50 compute-2-0 sshd[24851]: Accepted publickey for root 
from 10.1.255.226 port 49484 ssh2
Dec 17 15:13:50 compute-2-0 sshd[24851]: pam_unix(sshd:session): session 
opened for user root by (uid=0)
Dec 17 15:13:55 compute-2-0 sshd[24851]: Received disconnect from 
10.1.255.226: 11: disconnected by user
Dec 17 15:13:55 compute-2-0 sshd[24851]: pam_unix(sshd:session): session 
closed for user root
Dec 17 15:14:01 compute-2-0 sshd[24868]: Accepted publickey for root 
from 10.1.255.226 port 49485 ssh2
Dec 17 15:14:01 compute-2-0 sshd[24868]: pam_unix(sshd:session): session 
opened for user root by (uid=0)







On 12/17/2012 11:16 AM, Daniel Davidson wrote:
After a very long time (15 minutes or so) I finally received the following 
in addition to what I just sent earlier:


[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working 
on WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working 
on WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working 
on WILDCARD

[compute-2-1.local:69655] [[32341,0],0] daemon 1 failed with status 1
[compute-2-1.local:69655] [[32341,0],0] plm:base:orted_cmd sending 
orted_exit commands
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc working 
on WILDCARD
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc working 
on WILDCARD


Firewalls are down:

[root@compute-2-1 /]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination
[root@compute-2-0 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination

On 12/17/2012 11:09 AM, Ralph Castain wrote:
Hmmm...and that is ALL the output? If so, then it never succeeded in 
sending a message back, which leads one to suspect some kind of 
firewall in the way.


Looking at the ssh line, we are going to attempt to send a message 
from node 2-0 to node 2-1 on the 10.1.255.226 address. Is that going 
to work? Anything preventing it?



On Dec 17, 2012, at 8:56 AM, Daniel Davidson  
wrote:


These nodes have not been locked down to prevent jobs from being 
launched from the backend, at least not on purpose.  The added 
logging returns the information below:


[root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 10 --leave-session-attached -mca 
odls_base_verbose 5 -mca plm_base_verbose 5 hostname
[compute-2-1.local:69655] mca:base:select:(  plm) Querying component 
[rsh]
[compute-2-1.local:69655] [[INVALID],INVALID] plm:rsh_lookup on 
agent ssh : rsh path NULL
[compute-2-1.local:69655] mca:base:select:(  plm) Query of component 
[rsh] set priority to 10
[compute-2-1.local:69655] mca:base:select:(  plm) Querying component 
[slurm]
[compute-2-1.local:69655] mca:base:select:(  plm

Re: [OMPI users] mpi problems/many cpus per node

2012-12-17 Thread Daniel Davidson

Yes, it does.

Dan

[root@compute-2-1 ~]# ssh compute-2-0
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: No xauth data; using fake authentication data for X11 forwarding.
Last login: Mon Dec 17 16:13:00 2012 from compute-2-1.local
[root@compute-2-0 ~]# ssh compute-2-1
Warning: untrusted X11 forwarding setup failed: xauth key data not generated
Warning: No xauth data; using fake authentication data for X11 forwarding.
Last login: Mon Dec 17 16:12:32 2012 from biocluster.local
[root@compute-2-1 ~]#



On 12/17/2012 03:39 PM, Doug Reeder wrote:

Daniel,

Does passwordless ssh work? You need to make sure that it does.

Doug
On Dec 17, 2012, at 2:24 PM, Daniel Davidson wrote:


I would also add that scp seems to be creating the file in the /tmp directory 
of compute-2-0, and that /var/log/secure is showing ssh connections being 
accepted.  Is there anything in ssh that can limit connections that I need to 
look out for?  My guess is that it is part of the client prefs and not the 
server prefs, since I can initiate the MPI command from another machine and it 
works fine, even when it uses compute-2-0 and compute-2-1.

Dan


[root@compute-2-1 /]# date
Mon Dec 17 15:11:50 CST 2012
[root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 10 --leave-session-attached -mca 
odls_base_verbose 5 -mca plm_base_verbose 5 hostname
[compute-2-1.local:70237] mca:base:select:(  plm) Querying component [rsh]
[compute-2-1.local:70237] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh 
path NULL

[root@compute-2-0 tmp]# ls -ltr
total 24
-rw-------.  1 root    root       0 Nov 28 08:42 yum.log
-rw-------.  1 root    root    5962 Nov 29 10:50 yum_save_tx-2012-11-29-10-50SRba9s.yumtx
drwx------.  3 danield danield 4096 Dec 12 14:56 openmpi-sessions-danield@compute-2-0_0
drwx------.  3 root    root    4096 Dec 13 15:38 openmpi-sessions-root@compute-2-0_0
drwx------  18 danield danield 4096 Dec 14 09:48 openmpi-sessions-danield@compute-2-0.local_0
drwx------  44 root    root    4096 Dec 17 15:14 openmpi-sessions-root@compute-2-0.local_0

[root@compute-2-0 tmp]# tail -10 /var/log/secure
Dec 17 15:13:40 compute-2-0 sshd[24834]: Accepted publickey for root from 
10.1.255.226 port 49483 ssh2
Dec 17 15:13:40 compute-2-0 sshd[24834]: pam_unix(sshd:session): session opened 
for user root by (uid=0)
Dec 17 15:13:42 compute-2-0 sshd[24834]: Received disconnect from 10.1.255.226: 
11: disconnected by user
Dec 17 15:13:42 compute-2-0 sshd[24834]: pam_unix(sshd:session): session closed 
for user root
Dec 17 15:13:50 compute-2-0 sshd[24851]: Accepted publickey for root from 
10.1.255.226 port 49484 ssh2
Dec 17 15:13:50 compute-2-0 sshd[24851]: pam_unix(sshd:session): session opened 
for user root by (uid=0)
Dec 17 15:13:55 compute-2-0 sshd[24851]: Received disconnect from 10.1.255.226: 
11: disconnected by user
Dec 17 15:13:55 compute-2-0 sshd[24851]: pam_unix(sshd:session): session closed 
for user root
Dec 17 15:14:01 compute-2-0 sshd[24868]: Accepted publickey for root from 
10.1.255.226 port 49485 ssh2
Dec 17 15:14:01 compute-2-0 sshd[24868]: pam_unix(sshd:session): session opened 
for user root by (uid=0)






On 12/17/2012 11:16 AM, Daniel Davidson wrote:

After a very long time (15 minutes or so) I finally received the following in 
addition to what I just sent earlier:

[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc working on WILDCARD
[compute-2-1.local:69655] [[32341,0],0] daemon 1 failed with status 1
[compute-2-1.local:69655] [[32341,0],0] plm:base:orted_cmd sending orted_exit 
commands
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc working on WILDCARD
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc working on WILDCARD

Firewalls are down:

[root@compute-2-1 /]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination
[root@compute-2-0 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination

On 12/17/2012 11:09 AM, Ralph Castain wrote:

Hmmm...and that is ALL the output? If so, then it never succeeded in sending a 
message back, which leads one to suspect some kind of firewall in the way.

Looking at the ssh line, we are going to attempt to send a message from node 
2-0 to node 2-1 on the 10.1.255.226 address. Is that going to work? Anything 
preventing it?


On Dec 17, 2012, at 8:56 AM, Daniel Davidson  wrote:


These nodes have not been locked down yet so

Re: [OMPI users] mpi problems/many cpus per node

2012-12-19 Thread Daniel Davidson

I figured this out.

ssh was working, but scp was not, due to an MTU mismatch between the 
systems.  Adding MTU=1500 to my 
/etc/sysconfig/network-scripts/ifcfg-eth2 fixed the problem.


Dan

On 12/17/2012 04:12 PM, Daniel Davidson wrote:

Yes, it does.

Dan

[root@compute-2-1 ~]# ssh compute-2-0
Warning: untrusted X11 forwarding setup failed: xauth key data not 
generated
Warning: No xauth data; using fake authentication data for X11 
forwarding.

Last login: Mon Dec 17 16:13:00 2012 from compute-2-1.local
[root@compute-2-0 ~]# ssh compute-2-1
Warning: untrusted X11 forwarding setup failed: xauth key data not 
generated
Warning: No xauth data; using fake authentication data for X11 
forwarding.

Last login: Mon Dec 17 16:12:32 2012 from biocluster.local
[root@compute-2-1 ~]#



On 12/17/2012 03:39 PM, Doug Reeder wrote:

Daniel,

Does passwordless ssh work? You need to make sure that it does.

Doug
On Dec 17, 2012, at 2:24 PM, Daniel Davidson wrote:

I would also add that scp seems to be creating the file in the /tmp 
directory of compute-2-0, and that /var/log/secure is showing ssh 
connections being accepted.  Is there anything in ssh that can limit 
connections that I need to look out for?  My guess is that it is 
part of the client prefs and not the server prefs, since I can 
initiate the MPI command from another machine and it works fine, 
even when it uses compute-2-0 and compute-2-1.


Dan


[root@compute-2-1 /]# date
Mon Dec 17 15:11:50 CST 2012
[root@compute-2-1 /]# /home/apps/openmpi-1.7rc5/bin/mpirun -host 
compute-2-0,compute-2-1 -v  -np 10 --leave-session-attached -mca 
odls_base_verbose 5 -mca plm_base_verbose 5 hostname
[compute-2-1.local:70237] mca:base:select:(  plm) Querying component 
[rsh]
[compute-2-1.local:70237] [[INVALID],INVALID] plm:rsh_lookup on 
agent ssh : rsh path NULL


[root@compute-2-0 tmp]# ls -ltr
total 24
-rw-------.  1 root    root       0 Nov 28 08:42 yum.log
-rw-------.  1 root    root    5962 Nov 29 10:50 yum_save_tx-2012-11-29-10-50SRba9s.yumtx
drwx------.  3 danield danield 4096 Dec 12 14:56 openmpi-sessions-danield@compute-2-0_0
drwx------.  3 root    root    4096 Dec 13 15:38 openmpi-sessions-root@compute-2-0_0
drwx------  18 danield danield 4096 Dec 14 09:48 openmpi-sessions-danield@compute-2-0.local_0
drwx------  44 root    root    4096 Dec 17 15:14 openmpi-sessions-root@compute-2-0.local_0


[root@compute-2-0 tmp]# tail -10 /var/log/secure
Dec 17 15:13:40 compute-2-0 sshd[24834]: Accepted publickey for root 
from 10.1.255.226 port 49483 ssh2
Dec 17 15:13:40 compute-2-0 sshd[24834]: pam_unix(sshd:session): 
session opened for user root by (uid=0)
Dec 17 15:13:42 compute-2-0 sshd[24834]: Received disconnect from 
10.1.255.226: 11: disconnected by user
Dec 17 15:13:42 compute-2-0 sshd[24834]: pam_unix(sshd:session): 
session closed for user root
Dec 17 15:13:50 compute-2-0 sshd[24851]: Accepted publickey for root 
from 10.1.255.226 port 49484 ssh2
Dec 17 15:13:50 compute-2-0 sshd[24851]: pam_unix(sshd:session): 
session opened for user root by (uid=0)
Dec 17 15:13:55 compute-2-0 sshd[24851]: Received disconnect from 
10.1.255.226: 11: disconnected by user
Dec 17 15:13:55 compute-2-0 sshd[24851]: pam_unix(sshd:session): 
session closed for user root
Dec 17 15:14:01 compute-2-0 sshd[24868]: Accepted publickey for root 
from 10.1.255.226 port 49485 ssh2
Dec 17 15:14:01 compute-2-0 sshd[24868]: pam_unix(sshd:session): 
session opened for user root by (uid=0)







On 12/17/2012 11:16 AM, Daniel Davidson wrote:
After a very long time (15 minutes or so) I finally received the 
following in addition to what I just sent earlier:


[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc 
working on WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc 
working on WILDCARD
[compute-2-0.local:24659] [[32341,0],1] odls:kill_local_proc 
working on WILDCARD

[compute-2-1.local:69655] [[32341,0],0] daemon 1 failed with status 1
[compute-2-1.local:69655] [[32341,0],0] plm:base:orted_cmd sending 
orted_exit commands
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc 
working on WILDCARD
[compute-2-1.local:69655] [[32341,0],0] odls:kill_local_proc 
working on WILDCARD


Firewalls are down:

[root@compute-2-1 /]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination
[root@compute-2-0 ~]# iptables -L
Chain INPUT (policy ACCEPT)
target prot opt source   destination

Chain FORWARD (policy ACCEPT)
target prot opt source   destination

Chain OUTPUT (policy ACCEPT)
target prot opt source   destination

On 12/17/2012 11:09 AM, Ralph Castain wrote:
Hmmm...and that is ALL the output? If so, then it never succeeded 
in sending a message back, which leads one to suspect some kind of 
firewall in the way.


Looking at the

[OMPI users] mpirun completes for one user, not for another

2013-02-11 Thread Daniel Fetchinson
Hi folks,

I have a really strange problem: a super simple MPI test program (see
below) runs successfully for all users when executed on 4 processes in
1 node, but hangs for user A and runs successfully for user B when
executed on 8 processes in 2 nodes. The executable used is the same
and the appfile used is also the same for user A and user B. Both
users launch it by

mpirun --app appfile

where the content of 'appfile' is

-np 1 -host node1 -wdir /tmp/test ./test
-np 1 -host node1 -wdir /tmp/test ./test
-np 1 -host node1 -wdir /tmp/test ./test
-np 1 -host node1 -wdir /tmp/test ./test

for the single node run with 4 processes and is replaced by

-np 1 -host node1 -wdir /tmp/test ./test
-np 1 -host node1 -wdir /tmp/test ./test
-np 1 -host node1 -wdir /tmp/test ./test
-np 1 -host node1 -wdir /tmp/test ./test
-np 1 -host node2 -wdir /tmp/test ./test
-np 1 -host node2 -wdir /tmp/test ./test
-np 1 -host node2 -wdir /tmp/test ./test
-np 1 -host node2 -wdir /tmp/test ./test

for the 2-node run with 8 processes. Just to recap, the single node
run works for both user A and user B, but the 2-node run only works
for user B and it hangs for user A. It does respond to Ctrl-C though.
Both users use bash, have set up passwordless ssh, are able to ssh
from node1 to node2 and back, have the same PATH and use the same
'mpirun' executable.

At this point I've run out of ideas what to check and debug because
the setups look really identical. The test program is simply

#include 
#include 

int main( int argc, char **argv )
{
   int node;

   MPI_Init( &argc, &argv );
   MPI_Comm_rank( MPI_COMM_WORLD, &node );

   printf( "First Hello World from Node %d\n", node );
   MPI_Barrier( MPI_COMM_WORLD );
   printf( "Second Hello World from Node %d\n",node );

   MPI_Finalize(  );

   return 0;
}


I also asked both users to compile the test program separately, and
the resulting executable 'test' is the same for both, indicating again 
that identical gcc, mpicc, etc., are used. GCC is 4.5.1, Open MPI is 
1.5, and the interconnect is InfiniBand.

I've really run out of ideas what else to compare between user A and B.

Thanks for any hints,
Daniel





-- 
Psss, psss, put it down! - http://www.cafepress.com/putitdown



-- 
Psss, psss, put it down! - http://www.cafepress.com/putitdown


Re: [OMPI users] mpirun completes for one user, not for another

2013-02-11 Thread Daniel Fetchinson
Thanks a lot, this was exactly the problem:

> Make sure that the PATH really is identical between users -- especially for
> non-iteractive logins.  E.g.:
>
> env

Here PATH was correct.

> vs.
>
> ssh othernode env

Here PATH was not correct. The PATH was set in .bash_profile, and
apparently .bash_profile is not sourced for non-interactive logins;
only .bashrc is. Once the PATH was set in .bashrc instead, everything
was fine and the problem went away.

Thanks again,
Daniel


> Also check the LD_LIBRARY_PATH.
>
>
> On Feb 11, 2013, at 7:11 AM, Daniel Fetchinson 
> wrote:
>
>> Hi folks,
>>
>> I have a really strange problem: a super simple MPI test program (see
>> below) runs successfully for all users when executed on 4 processes in
>> 1 node, but hangs for user A and runs successfully for user B when
>> executed on 8 processes in 2 nodes. The executable used is the same
>> and the appfile used is also the same for user A and user B. Both
>> users launch it by
>>
>> mpirun --app appfile
>>
>> where the content of 'appfile' is
>>
>> -np 1 -host node1 -wdir /tmp/test ./test
>> -np 1 -host node1 -wdir /tmp/test ./test
>> -np 1 -host node1 -wdir /tmp/test ./test
>> -np 1 -host node1 -wdir /tmp/test ./test
>>
>> for the single node run with 4 processes and is replaced by
>>
>> -np 1 -host node1 -wdir /tmp/test ./test
>> -np 1 -host node1 -wdir /tmp/test ./test
>> -np 1 -host node1 -wdir /tmp/test ./test
>> -np 1 -host node1 -wdir /tmp/test ./test
>> -np 1 -host node2 -wdir /tmp/test ./test
>> -np 1 -host node2 -wdir /tmp/test ./test
>> -np 1 -host node2 -wdir /tmp/test ./test
>> -np 1 -host node2 -wdir /tmp/test ./test
>>
>> for the 2-node run with 8 processes. Just to recap, the single node
>> run works for both user A and user B, but the 2-node run only works
>> for user B and it hangs for user A. It does respond to Ctrl-C though.
>> Both users use bash, have set up passwordless ssh, are able to ssh
>> from node1 to node2 and back, have the same PATH and use the same
>> 'mpirun' executable.
>>
>> At this point I've run out of ideas what to check and debug because
>> the setups look really identical. The test program is simply
>>
>> #include 
>> #include 
>>
>> int main( int argc, char **argv )
>> {
>> int node;
>>
>> MPI_Init( &argc, &argv );
>> MPI_Comm_rank( MPI_COMM_WORLD, &node );
>>
>> printf( "First Hello World from Node %d\n", node );
>> MPI_Barrier( MPI_COMM_WORLD );
>> printf( "Second Hello World from Node %d\n",node );
>>
>> MPI_Finalize(  );
>>
>> return 0;
>> }
>>
>>
>> I also asked both users to compile the test program separately, and
>> the resulting executable 'test' is the same for both, indicating again
>> that identical gcc, mpicc, etc., are used. GCC is 4.5.1, Open MPI is
>> 1.5, and the interconnect is InfiniBand.
>>
>> I've really run out of ideas what else to compare between user A and B.
>>
>> Thanks for any hints,
>> Daniel
>>
>>
>>
>>
>>
>> --
>> Psss, psss, put it down! - http://www.cafepress.com/putitdown
>>
>>
>>
>> --
>> Psss, psss, put it down! - http://www.cafepress.com/putitdown
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


-- 
Psss, psss, put it down! - http://www.cafepress.com/putitdown


[OMPI users] Multi-threading support for openib

2013-11-27 Thread Daniel Cámpora
Dear list,

I've gone through several hours of configuring and testing to get a grasp
of the current status for multi-threading support.

I want to use a program with MPI_THREAD_MULTIPLE, over the openib BTL. I'm
using openmpi-1.6.5 and SLC6 (rhel6), for what's worth.

When configuring my own Open MPI library, if I just use
--enable-mpi-thread-multiple and execute my program with -mca btl openib,
it simply crashes because openib does not support MPI_THREAD_MULTIPLE.
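
A minimal sanity check along these lines (a sketch of mine using only
standard MPI-2 calls, not part of the actual program) shows at runtime
which thread level the library really grants:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;

    /* Ask for full multi-threading support and check what we really got. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not available (got level %d)\n",
                provided);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    printf("MPI_THREAD_MULTIPLE granted\n");
    MPI_Finalize();
    return 0;
}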

I've only started testing with --enable-opal-multi-threads, just in case it
would help me. Configuring with the aforementioned options,
./configure --enable-mpi-thread-multiple --enable-opal-multi-threads

results in a crash whenever executing my program,

$ mpirun -np 2 -mca mca_component_path
/usr/mpi/gcc/openmpi-1.6.5/lib64/openmpi -mca btl openib -mca
btl_openib_warn_default_gid_prefix 0 -mca btl_base_verbose 100 -mca
btl_openib_verbose 100 -machinefile machinefile.labs `pwd`/em_bu_app 2>&1 |
less
--
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
--
[lab14:13672] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file
runtime/orte_init.c at line 79
[lab14:13672] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file orterun.c
at line 694


Several questions related to these. Does --enable-opal-multi-threads have
any impact on the BTL multi-threading support? (If there's more
documentation on what this does I'd be glad to read it).

Is there any additional configuration tag necessary for enabling
opal-multi-threads to work?

Cheers, thanks a lot!

Daniel

-- 
Daniel Hugo Cámpora Pérez
European Organization for Nuclear Research (CERN)
PH LBC, LHCb Online Fellow
e-mail. dcamp...@cern.ch


[OMPI users] valgrind invalid reads for large self-sends using thread_multiple

2014-02-10 Thread Daniel Ibanez
Hello,

I have used OpenMPI in conjunction with Valgrind for a long time now, and
developed a list of suppressions for known false positives over time.

Now I am developing a library for inter-thread communication that is based
on using OpenMPI with MPI_THREAD_MULTIPLE support. I have noticed that
sending large messages from one thread to another in the same process will
cause valgrind to complain about invalid reads. I have narrowed it down to
one function being executed on four threads in one process. Attached is a
tarball containing the error-reproducing program, valgrind suppression
file, and valgrind output.

The strange thing is that the valgrind error message doesn't fit the
pattern of read-after-free or read-past-the-end. I'd like to know the
following:

1) Should I even worry? The code doesn't crash, only valgrind complains.
Is it a harmless false positive?
2) If it is an issue, am I using MPI right?
3) If I'm using it right, then what causes this? Some kind of internal
buffering issue?

Note that I use Issend, so nothing should be freed until it has completely
been read (in theory).
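
For context, the pattern I am describing boils down to roughly the
following stripped-down sketch (two threads instead of four and a made-up
message size; the attached tarball contains the actual reproducer):

#include <mpi.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define COUNT (1 << 22)   /* a "large" message: 4M ints */

static void *sender(void *arg)
{
    int *buf = malloc(COUNT * sizeof(int));
    MPI_Request req;
    memset(buf, 1, COUNT * sizeof(int));
    /* Synchronous mode: the send cannot complete before the matching
       receive has started, so buf should never be read after free(). */
    MPI_Issend(buf, COUNT, MPI_INT, 0, 42, MPI_COMM_SELF, &req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
    free(buf);
    return NULL;
}

static void *receiver(void *arg)
{
    int *buf = malloc(COUNT * sizeof(int));
    MPI_Recv(buf, COUNT, MPI_INT, 0, 42, MPI_COMM_SELF, MPI_STATUS_IGNORE);
    free(buf);
    return NULL;
}

int main(int argc, char **argv)
{
    int provided;
    pthread_t s, r;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE)
        MPI_Abort(MPI_COMM_WORLD, 1);

    pthread_create(&s, NULL, sender, NULL);
    pthread_create(&r, NULL, receiver, NULL);
    pthread_join(s, NULL);
    pthread_join(r, NULL);

    MPI_Finalize();
    return 0;
}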

Thank you,

-- 

Dan Ibanez


thread_test.tar
Description: Unix tar archive


[OMPI users] Process termination problem

2007-08-16 Thread Daniel Spångberg

Dear Open-MPI user list members,

I am currently helping a user with an application where one of the 
MPI processes dies, but Open MPI does not kill the rest of the 
application.


Since the mpirun man page states the following I would expect it to take  
care of killing the application if a process exits without calling  
MPI_Finalize:


   Process Termination / Signal Handling
   During  the run of an MPI application, if any rank dies abnormally  
(either exiting before invoking MPI_FINALIZE, or dying as the
   result of a signal), mpirun will print out an error message and  
kill the rest of the MPI application.


The following test program demonstrates the behaviour (program hangs until  
it is killed by the user or batch system):


#include 
#include 
#include 
#include 

#define RANK_DEATH 1

int main(int argc, char **argv)
{
  int rank;
  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);

  sleep(10);
  if (rank==RANK_DEATH)
exit(1);
  sleep(10);
  MPI_Finalize();
  return 0;
}

I have tested this on openmpi 1.2.1 as well as the latest stable 1.2.3. I  
am on Linux x86_64.


Is this a bug, or are there some flags I can use to force the mpirun (or  
orted, or...) to kill the whole MPI program when this happens?


If one of the application processes die from a signal (I have tested SEGV  
and FPE) rather than just exiting the whole application is indeed killed.


Best regards
Daniel Spångberg


Re: [OMPI users] Process termination problem

2007-08-17 Thread Daniel Spångberg

Dear George,

I think that the best way is to call MPI_Abort. However, this forces the 
user to modify the code, which I have already suggested. But their 
application is not calling exit directly; I merely wrote the simplest code 
that demonstrates the problem. Their application is a Fortran program, and 
during file I/O, when something bad happens, the Fortran runtime (PGI) 
calls exit (and sometimes _exit, for some reason). The file I/O is only done 
in one process. I have told them to try adding ERR=lineno,END=lineno, 
where the code at lineno calls MPI_Abort. This has not happened yet. 
Nevertheless, Open MPI does not terminate the application when one of the 
processes exits without MPI_Finalize, contrary to the content of the mpirun 
man page. I have currently "solved" the problem by writing a .so that is 
LD_PRELOADed, checking whether MPI_Finalize is indeed called between 
MPI_Init and exit/_exit. I'd rather not keep this "solution" for too long. 
If it is indeed so that the mpirun man page is wrong and the code right, 
I'd rather push for the proper error-handling solution.
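
In C terms, the error handling I am asking them to add amounts to a sketch
like the one below (their code is Fortran, where the ERR=/END= labels would
branch to a line that calls MPI_ABORT; the file name here is made up):

#include <mpi.h>
#include <stdio.h>

/* Abort the whole job instead of letting one rank exit() on a local
   I/O error; mpirun then kills all remaining ranks. */
static void read_input_or_abort(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        fprintf(stderr, "cannot open %s, aborting the MPI job\n", path);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    /* ... read the file ... */
    fclose(f);
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    read_input_or_abort(argc > 1 ? argv[1] : "input.dat");
    MPI_Finalize();
    return 0;
}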


Best regards
Daniel Spångberg


On Fri, 17 Aug 2007 18:25:17 +0200, George Bosilca   
wrote:



The MPI standard state that the correct way to abort/kill an MPI
application is using the MPI_Abort function. Except, if you're doing
some kind of fault tolerance stuff, there is no reason to end one of
your MPI processes via exit.

   Thanks,
 george.

On Aug 16, 2007, at 12:04 PM, Daniel Spångberg wrote:


Dear Open-MPI user list members,

I am currently helping a user with an application where one of the
MPI processes dies, but Open MPI does not kill the rest of the
application.

Since the mpirun man page states the following I would expect it to
take
care of killing the application if a process exits without calling
MPI_Finalize:

Process Termination / Signal Handling
During  the run of an MPI application, if any rank dies
abnormally
(either exiting before invoking MPI_FINALIZE, or dying as the
result of a signal), mpirun will print out an error message
and
kill the rest of the MPI application.

The following test program demonstrates the behaviour (program
hangs until
it is killed by the user or batch system):

#include 
#include 
#include 
#include 

#define RANK_DEATH 1

int main(int argc, char **argv)
{
   int rank;
   MPI_Init(&argc,&argv);
   MPI_Comm_rank(MPI_COMM_WORLD,&rank);

   sleep(10);
   if (rank==RANK_DEATH)
 exit(1);
   sleep(10);
   MPI_Finalize();
   return 0;
}

I have tested this on openmpi 1.2.1 as well as the latest stable
1.2.3. I
am on Linux x86_64.

Is this a bug, or are there some flags I can use to force the
mpirun (or
orted, or...) to kill the whole MPI program when this happens?

If one of the application processes die from a signal (I have
tested SEGV
and FPE) rather than just exiting the whole application is indeed
killed.

Best regards
Daniel Spångberg
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users







Re: [OMPI users] Process termination problem

2007-08-20 Thread Daniel Spångberg

Dear Sven,

I thought about doing that and experimented a bit as well, but there are 
some problems then: I need to relink the user's code, registering an atexit 
function is tricky from the Fortran code, and I still need to know whether 
MPI_Finalize (and, as it turns out, MPI_Init as well, otherwise there are 
problems with things like call system) has been called before my atexit 
routine is called...
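
For illustration, the state check itself is straightforward in C, roughly as
in the sketch below using MPI_Initialized/MPI_Finalized (a simplified
illustration, not my actual preloaded library, and it still misses _exit):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Exit guard: if the process reaches exit() between MPI_Init and
   MPI_Finalize, abort the whole job instead of leaving it hanging.
   _exit() bypasses atexit handlers, so it is not caught here. */
static void exit_guard(void)
{
    int initialized = 0, finalized = 0;
    MPI_Initialized(&initialized);
    MPI_Finalized(&finalized);
    if (initialized && !finalized) {
        fprintf(stderr, "exit() called without MPI_Finalize, aborting job\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    atexit(exit_guard);          /* register only after MPI_Init */
    /* ... application work; a stray exit(1) here now kills all ranks ... */
    MPI_Finalize();
    return 0;
}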


Best regards
Daniel

On Mon, 20 Aug 2007 14:37:44 +0200, Sven Stork  wrote:

instead of doing dirty tricks with the library, you could try to register a 
cleanup function with atexit.

Thanks,
  Sven

On Friday 17 August 2007 19:59, Daniel Spångberg wrote:

Dear George,

I think that the best way is to call MPI_Abort. However, this forces the
user to modify the code, which I already have suggested. But their
application is not calling exit directly, I merely wrote the simplest  
code
that demonstrates the problem. Their application is a Fortran program  
and

during file IO, when something bad happens, the fortran runtime (pgi)
calls exit (and sometimes _exit for some reason). The file IO is only  
done

in one process. I have told them to try adding ERR=lineno,END=lineno,
where the code at lineno calls MPI_Abort. This has not happened yet.
Nevertheless, openmpi does not terminate the application when one of
processes exits without MPI_Finalize, contrary to the content of mpirun
man-page. I have currently "solved" the problem by writing a .so that is
LD_PRELOAD:ed, checking whether MPI_Finalize is indeed called between
MPI_Init and exit/_exit. I'd rather not keep this "solution" for too  
long.

If it is indeed so that the mpirun man-page is wrong and the code right,
I'd rather push the proper error-handling solution.

Best regards
Daniel Spångberg


On Fri, 17 Aug 2007 18:25:17 +0200, George Bosilca  


wrote:

> The MPI standard state that the correct way to abort/kill an MPI
> application is using the MPI_Abort function. Except, if you're doing
> some kind of fault tolerance stuff, there is no reason to end one of
> your MPI processes via exit.
>
>    Thanks,
>  george.
>
> On Aug 16, 2007, at 12:04 PM, Daniel Spångberg wrote:
>
>> Dear Open-MPI user list members,
>>
>> I am currently helping a user with an application where one of the
>> MPI processes dies, but Open MPI does not kill the rest of the
>> application.
>>
>> Since the mpirun man page states the following I would expect it to
>> take
>> care of killing the application if a process exits without calling
>> MPI_Finalize:
>>
>> Process Termination / Signal Handling
>> During  the run of an MPI application, if any rank dies
>> abnormally
>> (either exiting before invoking MPI_FINALIZE, or dying as the
>> result of a signal), mpirun will print out an error message
>> and
>> kill the rest of the MPI application.
>>
>> The following test program demonstrates the behaviour (program
>> hangs until
>> it is killed by the user or batch system):
>>
>> #include 
>> #include 
>> #include 
>> #include 
>>
>> #define RANK_DEATH 1
>>
>> int main(int argc, char **argv)
>> {
>>int rank;
>>MPI_Init(&argc,&argv);
>>MPI_Comm_rank(MPI_COMM_WORLD,&rank);
>>
>>sleep(10);
>>if (rank==RANK_DEATH)
>>  exit(1);
>>sleep(10);
>>MPI_Finalize();
>>return 0;
>> }
>>
>> I have tested this on openmpi 1.2.1 as well as the latest stable
>> 1.2.3. I
>> am on Linux x86_64.
>>
>> Is this a bug, or are there some flags I can use to force the
>> mpirun (or
>> orted, or...) to kill the whole MPI program when this happens?
>>
>> If one of the application processes die from a signal (I have
>> tested SEGV
>> and FPE) rather than just exiting the whole application is indeed
>> killed.
>>
>> Best regards
>> Daniel Spångberg
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users









[OMPI users] Application using OpenMPI 1.2.3 hangs, error messages in mca_btl_tcp_frag_recv

2007-09-12 Thread Daniel Rozenbaum
Hello,

I'm working on an MPI application for which I recently started using Open MPI 
instead of LAM/MPI. Both with Open MPI and LAM/MPI it mostly runs ok, but 
there're a number of cases under which the application terminates abnormally 
when using LAM/MPI, and hangs when using Open MPI. I haven't been able to 
reduce the example reproducing the problem, so every time it takes about an 
hour of running time before the application hangs. It hangs right before it's 
supposed to end properly. The master and all the slave processes are showing in 
"top" consuming 100% CPU. The application just hangs there like that until I 
interrupt it.

Here's the command line:

orterun --prefix /path/to/openmpi -mca btl tcp,self -x PATH -x LD_LIBRARY_PATH 
--hostfile hostfile1 /path/to/app_executable 

hostfile1:

host1 slots=3
host2 slots=4
host3 slots=4
host4 slots=4
host5 slots=4
host6 slots=4
host7 slots=4
host8 slots=4
host9 slots=4
host10 slots=4
host11 slots=4
host12 slots=4
host13 slots=4
host14 slots=4

Each host is a dual-CPU dual-core Intel box running Red Hat Enterprise Server 4.


I caught the following error messages on app's stderr during the run:

[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: 
readv failed with errno=110
[host8][0,1,29][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] 
mca_btl_tcp_frag_recv: readv failed with errno=113

[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: 
readv failed with errno=110


Excerpts from strace output, and ompi_info are attached below.
Any advice would be greatly appreciated!
Thanks in advance,
Daniel


strace on the orterun process:

poll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=5, events=POLLIN}, 
{fd=8, events=POLLIN}, {fd=9, events=POLLIN}, {fd=10, events=POLLIN}, {fd=11, 
events=POLLIN}, {fd=12, events=POLLIN}, {fd=13, events=POLLIN}, {fd=14, 
events=POLLIN}, {fd=15, events=POLLIN}, {fd=16, events=POLLIN}, {fd=17, 
events=POLLIN}, {fd=18, events=POLLIN}, {fd=19, events=POLLIN}, {fd=20, 
events=POLLIN}, {fd=0, events=POLLIN}, {fd=21, events=POLLIN}, {fd=22, 
events=POLLIN}, {fd=23, events=POLLIN}, {fd=24, events=POLLIN}, {fd=25, 
events=POLLIN}, {fd=26, events=POLLIN}, {fd=27, events=POLLIN}, {fd=28, 
events=POLLIN}, {fd=29, events=POLLIN}, {fd=30, events=POLLIN}, {fd=31, 
events=POLLIN}, {fd=34, events=POLLIN}, {fd=33, events=POLLIN}, {fd=32, 
events=POLLIN}, {fd=35, events=POLLIN}, ...], 71, 1000) = 0
rt_sigprocmask(SIG_BLOCK, [INT USR1 USR2 TERM CHLD], NULL, 8) = 0
rt_sigaction(SIGCHLD, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGTERM, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGINT, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGUSR1, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGUSR2, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
sched_yield()   = 0
rt_sigprocmask(SIG_BLOCK, [INT USR1 USR2 TERM CHLD], NULL, 8) = 0
rt_sigaction(SIGCHLD, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGTERM, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGINT, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGUSR1, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigaction(SIGUSR2, {0x2a956c7e70, [INT USR1 USR2 TERM CHLD], 
SA_RESTORER|SA_RESTART, 0x3fdf80c4f0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [INT USR1 USR2 TERM CHLD], NULL, 8) = 0
poll([{fd=6, events=POLLIN}, {fd=7, events=POLLIN}, {fd=5, events=POLL



strace on the master process:

rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|SA_RESTART, 
0x3fdf80c4f0}, NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
rt_sigaction(SIGCHLD, {0x2a972cae70, [CHLD], SA_RESTORER|SA_RESTART, 
0x3fdf80c4f0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}, {fd=7, events=POLLIN}, 
{fd=8, events=POLLIN}, {fd=14, events=POLLIN}, {fd=11, events=POLLIN}, {fd=12, 
events=POLLIN}, {fd=13, events=POLLIN}, {fd=16, events=POLLIN}, {fd=15, 
events=POLLIN}, {fd=20, events=POLLIN}, {fd=21, events=POLLIN}, {fd=22, 
events=POLLIN}, {fd=23, events=POLLIN}, {fd=67, events=POLLIN}, {fd=25, 
events=POLLIN}, {fd=66, events=POLLIN}, {fd=26, events=POLLIN}, {fd=27, 
events=POLLIN}, {fd=28, events=POLLIN}, {fd=29, events=POLLIN}, {fd=30, 
events=POLLIN}, {fd=31, events=POLLIN}, {fd=32, events=POLLIN}, {fd=33, 
events=POLLIN}, {fd=34, events=POLLIN}, {fd=35

Re: [OMPI users] Application using OpenMPI 1.2.3 hangs, error messages in mca_btl_tcp_frag_recv

2007-09-17 Thread Daniel Rozenbaum

Jeff, thanks a lot for taking the time,

I looked into this some more, and this could very well be a side effect 
of a problem in my code, maybe a memory violation that messes things up; 
I'm going to valgrind this thing and see what comes up. Most of the time 
the app runs just fine, so I'm not sure whether it could also be a problem in 
the MPI messaging logic in my code; it could be, though.


What seems to be happening is this: the code of the server is written in 
such a manner that the server knows how many "responses" it's supposed 
to receive from all the clients, so when all the calculation tasks have 
been distributed, the server enters a loop inside which it calls 
MPI_Waitany on an array of handles until it receives all the results it 
expects. However, from my debug prints it looks like all the clients 
think they've sent all the results they could, and they're now all 
sitting in MPI_Probe, waiting for the server to send out the next 
instruction (which is supposed to contain a message indicating the end 
of the run). So, the server is stuck in MPI_Waitany() while all the 
clients are stuck in MPI_Probe().
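
Schematically, the exchange looks like the sketch below (hypothetical tags
and a single result per client; the real code hands out many tasks):

#include <mpi.h>
#include <stdlib.h>

#define TAG_RESULT 1
#define TAG_END    2

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {                       /* server */
        int nclients = size - 1, expected = nclients, dummy = 0;
        MPI_Request *reqs = malloc(nclients * sizeof(MPI_Request));
        int *results = malloc(nclients * sizeof(int));
        for (int i = 0; i < nclients; i++)
            MPI_Irecv(&results[i], 1, MPI_INT, i + 1, TAG_RESULT,
                      MPI_COMM_WORLD, &reqs[i]);
        while (expected > 0) {             /* hangs here if a result never arrives */
            int idx;
            MPI_Waitany(nclients, reqs, &idx, MPI_STATUS_IGNORE);
            expected--;
        }
        for (int i = 0; i < nclients; i++) /* end-of-run instruction */
            MPI_Send(&dummy, 1, MPI_INT, i + 1, TAG_END, MPI_COMM_WORLD);
        free(reqs);
        free(results);
    } else {                               /* client */
        int result = rank, dummy;
        MPI_Status st;
        MPI_Send(&result, 1, MPI_INT, 0, TAG_RESULT, MPI_COMM_WORLD);
        MPI_Probe(0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);  /* clients sit here */
        MPI_Recv(&dummy, 1, MPI_INT, 0, st.MPI_TAG, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }
    MPI_Finalize();
    return 0;
}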



I was wondering if you could comment on the "readv failed" messages I'm 
seeing in the server's stderr:


[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv] 
mca_btl_tcp_frag_recv: readv failed with errno=110


I'm seeing a few of these along the server's run, with errno=110 
("Connection timed out" according to the "perl -e 'die$!=errno'" method 
I found in OpenMPI FAQs), and I've also seen errno=113 ("No route to 
host"). Could this mean there's an occasional infrastructure problem? It 
would be strange, as it would then seem that this particular run somehow 
triggers it?.. Could these messages also mean that some messages got 
lost due to these errors, and that's why the server thinks it still has 
some results to receive while the clients think they've sent everything out?


Many thanks,
Daniel



Jeff Squyres wrote:
It sounds like we have a missed corner case of the OMPI run-time not 
cleaning up properly.  I know one case like this came up recently (if an 
app calls exit() without calling MPI_FINALIZE, OMPI v1.2.x hangs) and 
Ralph is working on it.


This could well be what is happening here...?

Do you know how your process is exiting?  If a process dies via  
signal, OMPI *should* be seeing that and cleaning up the whole job  
properly.




On Sep 12, 2007, at 10:50 PM, Daniel Rozenbaum wrote:

  

Hello,

I'm working on an MPI application for which I recently started  
using Open MPI instead of LAM/MPI. Both with Open MPI and LAM/MPI  
it mostly runs ok, but there're a number of cases under which the  
application terminates abnormally when using LAM/MPI, and hangs  
when using Open MPI. I haven't been able to reduce the example  
reproducing the problem, so every time it takes about an hour of  
running time before the application hangs. It hangs right before  
it's supposed to end properly. The master and all the slave  
processes are showing in "top" consuming 100% CPU. The application  
just hangs there like that until I interrupt it.


Here's the command line:

orterun --prefix /path/to/openmpi -mca btl tcp,self -x PATH -x  
LD_LIBRARY_PATH --hostfile hostfile1 /path/to/app_executable <app params>


hostfile1:

host1 slots=3
host2 slots=4
host3 slots=4
host4 slots=4
host5 slots=4
host6 slots=4
host7 slots=4
host8 slots=4
host9 slots=4
host10 slots=4
host11 slots=4
host12 slots=4
host13 slots=4
host14 slots=4

Each host is a dual-CPU dual-core Intel box running Red Hat  
Enterprise Server 4.



I caught the following error messages on app's stderr during the run:

[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]  
mca_btl_tcp_frag_recv: readv failed with errno=110
[host8][0,1,29][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]  
mca_btl_tcp_frag_recv: readv failed with errno=113


[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]  
mca_btl_tcp_frag_recv: readv failed with errno=110



Excerpts from strace output, and ompi_info are attached below.
Any advice would be greatly appreciated!
Thanks in advance,
Daniel




ompi_info --all:


Open MPI: 1.2.3
   Open MPI SVN revision: r15136
Open RTE: 1.2.3
   Open RTE SVN revision: r15136
OPAL: 1.2.3
   OPAL SVN revision: r15136
   MCA backtrace: execinfo (MCA v1.0, API v1.0, Component  
v1.2.3)
  MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component  
v1.2.3)

   MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.3)
   MCA maffinity: first_use (MCA v1.0, API v1.0, Component  
v1.2.3)
   MCA maffinity: libnuma (MCA v1.0, API v1.0, Component  
v1.2.3)

   MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.3)
 MCA installdirs: env (MCA v1.

Re: [OMPI users] Application using OpenMPI 1.2.3 hangs, error messages in mca_btl_tcp_frag_recv

2007-09-19 Thread Daniel Rozenbaum




I'm now running the same experiment under valgrind. It's probably
going to run for a few days, but interestingly what I'm seeing now is
that while running under valgrind's memcheck, the app has been
reporting much more of these "recv failed" errors, and not only on the
server node:

[host1][0,1,0]
[host4][0,1,13]
[host5][0,1,18]
[host8][0,1,30]
[host10][0,1,36]
[host12][0,1,46]

If in the original run I got 3 such messages, in the valgrind'ed run I
got about 45 so far, and the app still has about 75% of the work left.

I'm checking while all this is happening, and all the client processes
are still running, none exited early.

I've been analyzing the debug output in my original experiment, and it
does look like the server never receives any new messages from two of
the clients after the "recv failed" messages appear. If my analysis is
correct, these two clients ran on the same host. It might be the case
then that the messages with the next tasks to execute that the server
attempted to send to these two clients never reached them, or were
never sent. Interesting though that there were two additional clients
on the same host, and those seem to have kept working all along, until
the app got stuck.

Once this valgrind experiment is over, I'll proceed to your other
suggestion about the debug loop on the server side checking for any of
the requests the app is waiting for being MPI_REQUEST_NULL.

Many thanks,
Daniel


Jeff Squyres wrote:

  On Sep 17, 2007, at 11:26 AM, Daniel Rozenbaum wrote:

  
  
What seems to be happening is this: the code of the server is  
written in
such a manner that the server knows how many "responses" it's supposed
to receive from all the clients, so when all the calculation tasks  
have
been distributed, the server enters a loop inside which it calls
MPI_Waitany on an array of handles until it receives all the  
results it
expects. However, from my debug prints it looks like all the clients
think they've sent all the results they could, and they're now all
sitting in MPI_Probe, waiting for the server to send out the next
instruction (which is supposed to contain a message indicating the end
of the run). So, the server is stuck in MPI_Waitany() while all the
clients are stuck in MPI_Probe().

  
  
On the server side, try putting in a debug loop and see if any of the  
requests that your app is waiting for are not MPI_REQUEST_NULL (it's  
not a value of 0 -- you'll need to compare against  
MPI_REQUEST_NULL).  If there are any, see if you can trace backwards  
to see what request it is.
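
A minimal sketch of such a debug loop follows; the names requests and num_requests are placeholders for whatever the application actually uses.

#include <mpi.h>
#include <stdio.h>

/* Report which request slots are still outstanding, e.g. just before
 * the server calls MPI_Waitany().  Illustrative only. */
void dump_outstanding(MPI_Request *requests, int num_requests)
{
    int i;
    for (i = 0; i < num_requests; i++)
        if (requests[i] != MPI_REQUEST_NULL)
            fprintf(stderr, "request[%d] still pending\n", i);
}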

  
  
I was wondering if you could comment on the "readv failed" messages  
I'm
seeing in the server's stderr:

[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=110

I'm seeing a few of these along the server's run, with errno=110
("Connection timed out" according to the "perl -e 'die$!=errno'"  
method
I found in OpenMPI FAQs), and I've also seen errno=113 ("No route to
host"). Could this mean there's an occasional infrastructure  
problem? It
would be strange, as it would then seem that this particular run  
somehow
triggers it?.. Could these messages also mean that some messages got
lost due to these errors, and that's why the server thinks it still  
has
some results to receive while the clients think they've sent  
everything out?

  
  
That is all possible.  Sorry I missed that message in your original  
message -- it's basically a message saying that MPI_COMM_WORLD rank 0  
got a timeout from one of the peers that it shouldn't have.

You're sure that none of your processes are exiting early, right?   
You said they were all waiting in MPI_Probe, but I just wanted to  
double check that they're all still running.

Unfortunately, our error message is not very clear about which host  
it lost the connection with -- after you see that message, do you see  
incoming communications from all the slaves, or only some of them?

  






Re: [OMPI users] Application using OpenMPI 1.2.3 hangs, error messages in mca_btl_tcp_frag_recv

2007-09-27 Thread Daniel Rozenbaum




Here's some more info on the problem I've been struggling with; my
apologies for the lengthy posts, but I'm a little desperate here :-)

I was able to reduce the size of the experiment that reproduces the
problem, both in terms of input data size and the number of slots in
the cluster. The cluster now consists of 6 slots (5 clients), with two
of the clients running on the same node as the server and three others
on another node. This allowed me to follow Brian's
advice and run the server and all the clients under gdb and make
sure none of the processes terminates (normally or abnormally) when the
server reports the "readv failed" errors; this is indeed the case.

I then followed Jeff's
advice and added a debug loop just prior to the server calling
MPI_Waitany(), identifying the entries in the requests array which are
not
MPI_REQUEST_NULL, and then tracing back these
requests. What I found was the following:

At some point during the run, the server calls MPI_Waitany() on an
array of requests consisting of 96 elements, and gets stuck in it
forever; the only thing that happens at some point thereafter is that
the server reports a couple of "readv failed" errors:

[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=110
[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=110

According to my debug prints, just before that last call to
MPI_Waitany() the array requests[] contains 38 entries which are not
MPI_REQUEST_NULL. Half of these entries correspond to calls to Isend(),
half to Irecv(). Specifically, for example, entries
4,14,24,34,44,54,64,74,84,94 are used for Isend()'s from server to
client #3 (of 5), and entries 5,15,...,95 are used for Irecv() for the
same client.

I traced back what's going on, for instance, with requests[4]. As I
mentioned, it corresponds to a call to MPI_Isend() initiated by the
server to client #3 (of 5). By the time the server gets stuck in
Waitany(), this client has already correctly processed the first
Isend() from master in requests[4], returned its response in
requests[5], and the server received this response properly. After
receiving this response, the server Isend()'s the next task to this
client in requests[4], and this is correctly reflected in "requests[4]
!= MPI_REQUESTS_NULL" just before the last call to Waitany(), but for
some reason this send doesn't seem to go any further.

Looking at all other requests[] corresponding to Isend()'s initiated by
the server to the same client (14,24,...,94), they're all also not
MPI_REQUEST_NULL, and are not going any further either.

One thing that might be important is that the messages the server is
sending to the clients in my experiment are quite large, ranging from
hundreds of Kbytes to several Mbytes, the largest being around 9
Mbytes. The largest messages take place at the beginning of the run and
are processed correctly though.

Also, I ran the same experiment on another cluster that uses slightly
different
hardware and network infrastructure, and could not reproduce the
problem.

Hope at least some of the above makes some sense. Any additional advice
would be greatly appreciated!
Many thanks,
Daniel


Daniel Rozenbaum wrote:

  
  
  I'm now running the same experiment under valgrind. It's probably
going to run for a few days, but interestingly what I'm seeing now is
that while running under valgrind's memcheck, the app has been
reporting much more of these "recv failed" errors, and not only on the
server node:
  
[host1][0,1,0]
[host4][0,1,13]
[host5][0,1,18]
[host8][0,1,30]
[host10][0,1,36]
[host12][0,1,46]
  
If in the original run I got 3 such messages, in the valgrind'ed run I
got about 45 so far, and the app still has about 75% of the work left.
  
I'm checking while all this is happening, and all the client processes
are still running, none exited early.
  
I've been analyzing the debug output in my original experiment, and it
does look like the server never receives any new messages from two of
the clients after the "recv failed" messages appear. If my analysis is
correct, these two clients ran on the same host. It might be the case
then that the messages with the next tasks to execute that the server
attempted to send to these two clients never reached them, or were
never sent. Interesting though that there were two additional clients
on the same host, and those seem to have kept working all along, until
the app got stuck.
  
Once this valgrind experiment is over, I'll proceed to your other
suggestion about the debug loop on the server side checking for any of
the requests the app is waiting for being MPI_REQUEST_NULL.
  
Many thanks,
Daniel
  
  
Jeff Squyres wrote:
  
On Sep 17, 2007, at 11:26 AM, Daniel Rozenbaum wrote:

  

  What seems to be happeni

Re: [OMPI users] Application using OpenMPI 1.2.3 hangs, error messages in mca_btl_tcp_frag_recv

2007-09-28 Thread Daniel Rozenbaum




Good Open MPI gurus,

I've further reduced the size of the experiment that reproduces the
problem. My array of requests now has just 10 entries, and by the time
the server gets stuck in MPI_Waitany(), and three of the clients are
stuck in MPI_Recv(), the array has three unprocessed Isend()'s and
three unprocessed Irecv()'s.

I've upgraded to Open MPI 1.2.4, but this made no difference.

Are there any internal logging or debugging facilities in Open MPI that
would allow me to further track the calls that eventually result in the
error in mca_btl_tcp_frag_recv() ?

Thanks,
Daniel


Daniel Rozenbaum wrote:

  
  Here's some more info on the problem I've been struggling with;
my
apologies for the lengthy posts, but I'm a little desperate here :-)
  
I was able to reduce the size of the experiment that reproduces the
problem, both in terms of input data size and the number of slots in
the cluster. The cluster now consists of 6 slots (5 clients), with two
of the clients running on the same node as the server and three others
on another node. This allowed me to follow Brian's
advice and run the server and all the clients under gdb and make
sure none of the processes terminates (normally or abnormally) when the
server reports the "readv failed" errors; this is indeed the case.
  
I then followed Jeff's
advice and added a debug loop just prior to the server calling
MPI_Waitany(), identifying the entries in the requests array which are
not
MPI_REQUEST_NULL, and then tracing back these
requests. What I found was the following:
  
At some point during the run, the server calls MPI_Waitany() on an
array of requests consisting of 96 elements, and gets stuck in it
forever; the only thing that happens at some point thereafter is that
the server reports a couple of "readv failed" errors:
  
[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=110
[host1][0,1,0][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=110
  
  According to my debug prints, just before that last call to
MPI_Waitany() the array requests[] contains 38 entries which are not
MPI_REQUEST_NULL. Half of these entries correspond to calls to Isend(),
half to Irecv(). Specifically, for example, entries
4,14,24,34,44,54,64,74,84,94 are used for Isend()'s from server to
client #3 (of 5), and entries 5,15,...,95 are used for Irecv() for the
same client.
  
I traced back what's going on, for instance, with requests[4]. As I
mentioned, it corresponds to a call to MPI_Isend() initiated by the
server to client #3 (of 5). By the time the server gets stuck in
Waitany(), this client has already correctly processed the first
Isend() from master in requests[4], returned its response in
requests[5], and the server received this response properly. After
receiving this response, the server Isend()'s the next task to this
client in requests[4], and this is correctly reflected in "requests[4]
!= MPI_REQUESTS_NULL" just before the last call to Waitany(), but for
some reason this send doesn't seem to go any further.
  
Looking at all other requests[] corresponding to Isend()'s initiated by
the server to the same client (14,24,...,94), they're all also not
MPI_REQUEST_NULL, and are not going any further either.
  
One thing that might be important is that the messages the server is
sending to the clients in my experiment are quite large, ranging from
hundreds of Kbytes to several Mbytes, the largest being around 9
Mbytes. The largest messages take place at the beginning of the run and
are processed correctly though.
  
Also, I ran the same experiment on another cluster that uses slightly
different
hardware and network infrastructure, and could not reproduce the
problem.
  
Hope at least some of the above makes some sense. Any additional advice
would be greatly appreciated!
Many thanks,
Daniel
  





[OMPI users] MPI_Probe succeeds, but subsequent MPI_Recv gets stuck

2007-10-03 Thread Daniel Rozenbaum




Hi again,

I'm trying to debug the problem I posted
on several times recently; I thought I'd try asking a more focused
question:

I have the following sequence in the client code:

MPI_Status stat;
  ret = MPI_Probe(0, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
  assert(ret == MPI_SUCCESS);
  ret = MPI_Get_elements(&stat, MPI_BYTE, &count);
  assert(ret == MPI_SUCCESS);
  char *buffer = malloc(count);
  assert(buffer != NULL);
  ret = MPI_Recv((void *)buffer, count, MPI_BYTE, 0, stat.MPI_TAG,
MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  assert(ret == MPI_SUCCESS);
  fprintf(stderr, "MPI_Recv done\n");
  

Each MPI_ call in the lines above is surrounded by debug prints
that print out the client's rank, current time, the action about to be
taken with all its parameters' values, and the action's result. After
the first cycle (receive message from server -- process it -- send
response -- wait for next message) works out as
expected, the next cycle gets stuck in MPI_Recv. What I get in my debug
prints is more or less the following:

MPI_Probe(source= 0, tag= MPI_ANY_TAG, comm=
MPI_COMM_WORKD, status= )
  MPI_Probe done, source= 0, tag= 2, error= 0
  MPI_Get_elements(status= , dtype= MPI_BYTE,
count= )
  MPI_Get_elements done, count= 2731776
  MPI_Recv(buf= , count= 2731776, dtype= MPI_BYTE,
src= "" tag= 2, comm= MPI_COMM_WORLD, stat= MPI_STATUS_IGNORE)
  

My question then is this - what would cause MPI_Recv to not return,
after the immediately preceding MPI_Probe and MPI_Get_elements return
properly?

Thanks,
Daniel







Re: [OMPI users] MPI_Probe succeeds, but subsequent MPI_Recv gets stuck

2007-10-18 Thread Daniel Rozenbaum
Unfortunately, so far I haven't even been able to reproduce it on a 
different cluster. Since I had no success getting to the bottom of this 
problem, I've been concentrating my efforts on changing the app so that 
there's no need to send very large messages; I might be able to find 
time later to create a short example that shows the problem.


FWIW, when I was debugging it, I peeked a little into Open MPI code, and 
found that the client's MPI_Recv gets stuck in mca_pml_ob1_recv(), after 
it determines that "recvreq->req_recv.req_base.req_ompi.req_complete == 
false" and calls opal_condition_wait().


Jeff Squyres wrote:

Can you send a short test program that shows this problem, perchance?


On Oct 3, 2007, at 1:41 PM, Daniel Rozenbaum wrote:

  

Hi again,

I'm trying to debug the problem I posted on several times recently;  
I thought I'd try asking a more focused question:


I have the following sequence in the client code:
MPI_Status stat;
ret = MPI_Probe(0, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
assert(ret == MPI_SUCCESS);
ret = MPI_Get_elements(&stat, MPI_BYTE, &count);
assert(ret == MPI_SUCCESS);
char *buffer = malloc(count);
assert(buffer != NULL);
ret = MPI_Recv((void *)buffer, count, MPI_BYTE, 0, stat.MPI_TAG,  
MPI_COMM_WORLD, MPI_STATUS_IGNORE);

assert(ret == MPI_SUCCESS);
fprintf(stderr, "MPI_Recv done\n");
[... code that processes the message and sends the response back to the server]
Each MPI_ call in the lines above is surrounded by debug prints  
that print out the client's rank, current time, the action about to  
be taken with all its parameters' values, and the action's result.  
After the first cycle (receive message from server -- process it --  
send response -- wait for next message) works out as expected, the  
next cycle gets stuck in MPI_Recv. What I get in my debug prints is  
more or less the following:
MPI_Probe(source= 0, tag= MPI_ANY_TAG, comm= MPI_COMM_WORKD,  
status= )

MPI_Probe done, source= 0, tag= 2, error= 0
MPI_Get_elements(status= , dtype= MPI_BYTE, count=  
)

MPI_Get_elements done, count= 2731776
MPI_Recv(buf= , count= 2731776, dtype= MPI_BYTE, src= 0,  
tag= 2, comm= MPI_COMM_WORLD, stat= MPI_STATUS_IGNORE)
failed" errors in server's stderr>
My question then is this - what would cause MPI_Recv to not return,  
after the immediately preceding MPI_Probe and MPI_Get_elements  
return properly?


Thanks,
Daniel





Re: [OMPI users] MPI_Probe succeeds, but subsequent MPI_Recv gets stuck

2007-10-18 Thread Daniel Rozenbaum
Yes, a memory bug has been my primary focus due to the not entirely 
consistent nature of this problem; I valgrind'ed the app a number of 
times, to no avail though. Will post again if anything new comes up... 
Thanks!


Jeff Squyres wrote:
Yes, that's the normal progression.  For some reason, OMPI appears to  
have decided that it had not yet received the message.  Perhaps a  
memory bug in your application...?  Have you run it through valgrind,  
or some other memory-checking debugger, perchance?


On Oct 18, 2007, at 12:35 PM, Daniel Rozenbaum wrote:

  

Unfortunately, so far I haven't even been able to reproduce it on a
different cluster. Since I had no success getting to the bottom of  
this
problem, I've been concentrating my efforts on changing the app so  
that

there's no need to send very large messages; I might be able to find
time later to create a short example that shows the problem.

FWIW, when I was debugging it, I peeked a little into Open MPI  
code, and
found that the client's MPI_Recv gets stuck in mca_pml_ob1_recv(), after
it determines that "recvreq->req_recv.req_base.req_ompi.req_complete ==
false" and calls opal_condition_wait().

Jeff Squyres wrote:


Can you send a short test program that shows this problem, perchance?


On Oct 3, 2007, at 1:41 PM, Daniel Rozenbaum wrote:


  

Hi again,

I'm trying to debug the problem I posted on several times recently;
I thought I'd try asking a more focused question:

I have the following sequence in the client code:
MPI_Status stat;
ret = MPI_Probe(0, MPI_ANY_TAG, MPI_COMM_WORLD, &stat);
assert(ret == MPI_SUCCESS);
ret = MPI_Get_elements(&stat, MPI_BYTE, &count);
assert(ret == MPI_SUCCESS);
char *buffer = malloc(count);
assert(buffer != NULL);
ret = MPI_Recv((void *)buffer, count, MPI_BYTE, 0, stat.MPI_TAG,
MPI_COMM_WORLD, MPI_STATUS_IGNORE);
assert(ret == MPI_SUCCESS);
fprintf(stderr, "MPI_Recv done\n");

Each MPI_ call in the lines above is surrounded by debug prints
that print out the client's rank, current time, the action about to
be taken with all its parameters' values, and the action's result.
After the first cycle (receive message from server -- process it --
send response -- wait for next message) works out as expected, the
next cycle gets stuck in MPI_Recv. What I get in my debug prints is
more or less the following:
MPI_Probe(source= 0, tag= MPI_ANY_TAG, comm= MPI_COMM_WORKD,
status= )
MPI_Probe done, source= 0, tag= 2, error= 0
MPI_Get_elements(status= , dtype= MPI_BYTE, count=
)
MPI_Get_elements done, count= 2731776
MPI_Recv(buf= , count= 2731776, dtype= MPI_BYTE, src= 0,
tag= 2, comm= MPI_COMM_WORLD, stat= MPI_STATUS_IGNORE)

My question then is this - what would cause MPI_Recv to not return,
after the immediately preceding MPI_Probe and MPI_Get_elements
return properly?

Thanks,
Daniel




[OMPI users] SCALAPACK: Segmentation Fault (11) and Signal code: Address not mapped (1)

2008-01-22 Thread Backlund, Daniel

Hello all, I am using OMPI 1.2.4 on a Linux cluster (Rocks 4.2). OMPI was 
configured to use the 
Pathscale Compiler Suite installed in the (NFS mounted on nodes) 
/home/PROGRAMS/pathscale. I am 
trying to compile and run the example1.f that comes with the ACML package from 
AMD, and I am 
unable to get it to run. All nodes have the same Opteron processors and 2GB ram 
per core. OMPI 
was configured as below.

export CC=pathcc
export CXX=pathCC
export FC=pathf90
export F77=pathf90

./configure --prefix=/opt/openmpi/1.2.4 --enable-static --without-threads 
--without-memory-manager \
  --without-libnuma --disable-mpi-threads

The configuration was successful, the install was successful, I can even run a 
sample mpihello.f90 
program. I would eventually like to link the ACML SCALAPACK and BLACS libraries 
to our code, but I 
need some help. The ACML version is 3.1.0 for pathscale64. I go into the 
scalapack_examples directory, 
modify GNUmakefile to the correct values, and compile successfully. I have made 
openmpi into an rpm and 
pushed it to the nodes, modified LD_LIBRARY_PATH and PATH, and made sure I can 
see it on all nodes. 
When I try to run the example1.exe which is generated, using 
/opt/openmpi/1.2.4/bin/mpirun -np 6 example1.exe
I get the following output:

 example1.res 

[XXX:31295] *** Process received signal ***
[XXX:31295] Signal: Segmentation fault (11)
[XXX:31295] Signal code: Address not mapped (1)
[XXX:31295] Failing at address: 0x4470
[XXX:31295] *** End of error message ***
[XXX:31298] *** Process received signal ***
[XXX:31298] Signal: Segmentation fault (11)
[XXX:31298] Signal code: Address not mapped (1)
[XXX:31298] Failing at address: 0x4470
[XXX:31298] *** End of error message ***
[XXX:31299] *** Process received signal ***
[XXX:31299] Signal: Segmentation fault (11)
[XXX:31299] Signal code: Address not mapped (1)
[XXX:31299] Failing at address: 0x4470
[XXX:31299] *** End of error message ***
[XXX:31300] *** Process received signal ***
[XXX:31300] Signal: Segmentation fault (11)
[XXX:31300] Signal code: Address not mapped (1)
[XXX:31300] Failing at address: 0x4470
[XXX:31300] *** End of error message ***
[XXX:31296] *** Process received signal ***
[XXX:31296] Signal: Segmentation fault (11)
[XXX:31296] Signal code: Address not mapped (1)
[XXX:31296] Failing at address: 0x4470
[XXX:31296] *** End of error message ***
[XXX:31297] *** Process received signal ***
[XXX:31297] Signal: Segmentation fault (11)
[XXX:31297] Signal code: Address not mapped (1)
[XXX:31297] Failing at address: 0x4470
[XXX:31297] *** End of error message ***
mpirun noticed that job rank 0 with PID 31295 on node XXX.ourdomain.com 
exited on signal 11 (Segmentation fault). 
5 additional processes aborted (not shown)

 end example1.res 

Here is the result of ldd example1.exe

 ldd example1.exe 
libmpi_f90.so.0 => /opt/openmpi/1.2.4/lib/libmpi_f90.so.0 
(0x002a9557d000)
libmpi_f77.so.0 => /opt/openmpi/1.2.4/lib/libmpi_f77.so.0 
(0x002a95681000)
libmpi.so.0 => /opt/openmpi/1.2.4/lib/libmpi.so.0 (0x002a957b3000)
libopen-rte.so.0 => /opt/openmpi/1.2.4/lib/libopen-rte.so.0 
(0x002a959fb000)
libopen-pal.so.0 => /opt/openmpi/1.2.4/lib/libopen-pal.so.0 
(0x002a95be7000)
librt.so.1 => /lib64/tls/librt.so.1 (0x003e7cd0)
libnsl.so.1 => /lib64/libnsl.so.1 (0x003e7c20)
libutil.so.1 => /lib64/libutil.so.1 (0x003e79e0)
libmv.so.1 => /home/PROGRAMS/pathscale/lib/3.0/libmv.so.1 
(0x002a95d4d000)
libmpath.so.1 => /home/PROGRAMS/pathscale/lib/3.0/libmpath.so.1 
(0x002a95e76000)
libm.so.6 => /lib64/tls/libm.so.6 (0x003e77a0)
libdl.so.2 => /lib64/libdl.so.2 (0x003e77c0)
libpathfortran.so.1 => 
/home/PROGRAMS/pathscale/lib/3.0/libpathfortran.so.1 (0x002a95f97000)
libc.so.6 => /lib64/tls/libc.so.6 (0x003e7770)
libpthread.so.0 => /lib64/tls/libpthread.so.0 (0x003e7820)
/lib64/ld-linux-x86-64.so.2 (0x003e7680)
 end ldd 

Like I said, the compilation of the example program yields no errors, it just 
will not run. 
Does anybody have any suggestions? Am I doing something wrong?



Re: [OMPI users] flash2.5 with openmpi

2008-01-25 Thread Daniel Pfenniger

Hi,

Brock Palen wrote:
Is anyone using flash with openMPI?  we are here, but when ever it  
tries to write its second checkpoint file it segfaults once it gets  
to 2.2GB always in the same location.


Debugging is a pain as it takes 3 days to get to that point.  Just  
wondering if anyone else has seen this same behavior.


Just to make testing faster you might think reducing the file output
interval (trstrt or nrstrt parameters in flash.par), and decrease the
resolution (lrefine_max) to produce smaller files and to see whether
the problem is related with the file size.

Dan



Re: [OMPI users] SCALAPACK: Segmentation Fault (11) and Signal code:Address not mapped (1)

2008-01-30 Thread Backlund, Daniel

Jeff, thank you for your suggestion; I am sure that the correct mpif.h is 
being included. One 
thing that I did not do in my original message was submit the job to SGE. I did 
that and the 
program still failed with the same seg fault messages.

Below is the output of the job submitted to SGE.

<<< example1.output >>>

[compute-0-1:19367] *** Process received signal ***
[compute-0-5:19650] *** Process received signal ***
[compute-0-3:17571] *** Process received signal ***
[compute-0-1:19366] *** Process received signal ***
[compute-0-1:19366] Signal: Segmentation fault (11)
[compute-0-1:19366] Signal code: Address not mapped (1)
[compute-0-1:19366] Failing at address: 0x4470
[compute-0-1:19366] *** End of error message ***
[compute-0-5:19650] Signal: Segmentation fault (11)
[compute-0-5:19650] Signal code: Address not mapped (1)
[compute-0-5:19650] Failing at address: 0x4470
[compute-0-5:19650] *** End of error message ***
[compute-0-3:17571] Signal: Segmentation fault (11)
[compute-0-3:17571] Signal code: Address not mapped (1)
[compute-0-3:17571] Failing at address: 0x4470
[compute-0-3:17571] *** End of error message ***
[compute-0-1:19367] Signal: Segmentation fault (11)
[compute-0-1:19367] Signal code: Address not mapped (1)
[compute-0-1:19367] Failing at address: 0x4470
[compute-0-1:19367] *** End of error message ***
[compute-0-5:19651] *** Process received signal ***
[compute-0-5:19651] Signal: Segmentation fault (11)
[compute-0-5:19651] Signal code: Address not mapped (1)
[compute-0-5:19651] Failing at address: 0x4470
[compute-0-5:19651] *** End of error message ***
[compute-0-3:17572] *** Process received signal ***
[compute-0-3:17572] Signal: Segmentation fault (11)
[compute-0-3:17572] Signal code: Address not mapped (1)
[compute-0-3:17572] Failing at address: 0x4470
[compute-0-3:17572] *** End of error message ***
[compute-0-1.local:19292] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
base/pls_base_orted_cmds.c at line 275
[compute-0-1.local:19292] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
pls_gridengine_module.c at line 791
[compute-0-1.local:19292] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c 
at line 90
mpirun noticed that job rank 2 with PID 19650 on node compute-0-5.local exited 
on signal 11 (Segmentation fault). 
*** glibc detected *** free(): invalid pointer: 0x00606b80 ***
[compute-0-1.local:19292] ERROR: A daemon on node compute-0-5.local failed to 
start as expected.
[compute-0-1.local:19292] ERROR: There may be more information available from
[compute-0-1.local:19292] ERROR: the 'qstat -t' command on the Grid Engine 
tasks.
[compute-0-1.local:19292] ERROR: If the problem persists, please restart the
[compute-0-1.local:19292] ERROR: Grid Engine PE job
[compute-0-1.local:19292] The daemon received a signal 6 (with core).
[compute-0-1.local:19292] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
base/pls_base_orted_cmds.c at line 188
[compute-0-1.local:19292] [0,0,0] ORTE_ERROR_LOG: Timeout in file 
pls_gridengine_module.c at line 826
--
mpirun was unable to cleanly terminate the daemons for this job. Returned value 
Timeout instead of ORTE_SUCCESS.
--
[compute-0-1.local:19365] OOB: Connection to HNP lost

<<< END example1.output >>>

Is it possible that the ACML libraries are incompatible with linking to my 
version of OMPI? 
Or like Jeff said, maybe it is just a Pathscale bug. I hope not.

Daniel

-Original Message-
From: users-boun...@open-mpi.org on behalf of Backlund, Daniel
Sent: Tue 1/22/2008 3:06 PM
To: us...@open-mpi.org
Subject: [OMPI users] SCALAPACK: Segmentation Fault (11) and Signal 
code:Address not mapped (1)
 

Hello all, I am using OMPI 1.2.4 on a Linux cluster (Rocks 4.2). OMPI was 
configured to use the 
Pathscale Compiler Suite installed in the (NFS mounted on nodes) 
/home/PROGRAMS/pathscale. I am 
trying to compile and run the example1.f that comes with the ACML package from 
AMD, and I am 
unable to get it to run. All nodes have the same Opteron processors and 2GB ram 
per core. OMPI 
was configured as below.

export CC=pathcc
export CXX=pathCC
export FC=pathf90
export F77=pathf90

./configure --prefix=/opt/openmpi/1.2.4 --enable-static --without-threads 
--without-memory-manager \
  --without-libnuma --disable-mpi-threads

The configuration was successful, the install was successful, I can even run a 
sample mpihello.f90 
program. I would eventually like to link the ACML SCALAPACK and BLACS libraries 
to our code, but I 
need some help. The ACML version is 3.1.0 for pathscale64. I go into the 
scalapack_examples directory, 
modify GNUmakefile to the correct values, and compile successfully. I have made 
openmpi into an rpm and 
pushed it to the nodes, modified LD_LIBRARY_PATH and PATH, and made sure I can 
see it on all nodes. 
When I 

[OMPI users] MPI_Alltoallv and unknown data send sizes

2008-09-10 Thread Daniel Spångberg

Dear all,

First some background, the real question is at the end of this (longish)  
mail.


I have a problem where I need to exchange data between all processes. The  
data is unevenly distributed and I thought at first I could use  
MPI_Alltoallv to transfer the data. However, in my case, the receivers do  
not know how many data items the senders will send, but it is relatively  
easy to set up so the receiver can figure out the maximum number of items  
the sender will send, so I set the recvcounts to the maximum possible, and  
the sendcounts to the actual number of elements (smaller than recvcounts).


The mpi-forum description (from  
http://www.mpi-forum.org/docs/mpi21-report/node99.htm) describes the  
following:


MPI_ALLTOALLV(sendbuf, sendcounts, sdispls, sendtype, recvbuf, recvcounts,  
rdispls, recvtype, comm)

IN sendbuf  starting address of send buffer (choice)
IN sendcounts	integer array equal to the group size specifying the number  
of elements to send to each processor
IN sdispls	integer array (of length group size). Entry j specifies the  
displacement (relative to sendbuf) from which to take the outgoing data  
destined for process j

IN sendtype data type of send buffer elements (handle)
OUT recvbuf address of receive buffer (choice)
IN recvcounts	integer array equal to the group size specifying the number  
of elements that can be received from each processor
IN rdispls	integer array (of length group size). Entry i specifies the  
displacement (relative to recvbuf) at which to place the incoming data  
from process i

IN recvtype data type of receive buffer elements (handle)
IN comm communicator (handle)

In particular the wording is "the number of elements that can be received  
from each processor" for recvcounts, and does not say that this must be  
exactly the same as the number of elements sent.


It also mentions that it should work similarly as a number of independent  
MPI_Send/MPI_Recv calls. The amount of data sent in such a case does not  
need to exactly match the amount of data received.


I, unfortunately, missed the following:

The type signature associated with sendcounts[j], sendtypes[j] at process  
i must be equal to the type signature associated with recvcounts[i],  
recvtypes[i] at process j. This implies that the amount of data sent must  
be equal to the amount of data received, pairwise between every pair of  
processes. Distinct type maps between sender and receiver are still  
allowed.


And the openmpi man page shows
   When a pair of processes exchanges data, each may pass different
   element count and datatype arguments so long as the sender specifies
   the same amount of data to send (in bytes) as the receiver expects to
   receive.

I did test my program on different send/recv counts, and while it  
sometimes works, sometimes it does not. Even if it worked I would not be  
comfortable using it anyway.



The question is: If there is no way of determining the length of the data  
sent by the sender on the receiving end, I see two options: Either always  
transmit too much data using MPI_Alltoall(v) or cook up my own routine  
based on PTP calls, probably MPI_Sendrecv is the best option. Am I missing  
something?


--
Daniel Spångberg
Materials Chemistry
Uppsala University
Sweden


Re: [OMPI users] MPI_Alltoallv and unknown data send sizes

2008-09-10 Thread Daniel Spångberg

George, thanks for the quick answer!

I thought about using alltoall before the alltoallv, but it "feels" like  
this might end up slow, having two all-to-all calls and at least doubling the latency.  
Might still be faster than a large bunch of sendrecvs, of course. I'll  
simply have to do some short tests anyway, if it turns out the  
alltoall/alltoallv combo is too slow.


Thanks again!
Daniel

On 2008-09-10 17:10:06, George Bosilca wrote:


Daniel,

Your understanding of he MPI standard requirement with regard to  
MPI_Alltoallv is now 100% accurate. The send count and datatype should  
match what the receiver expect. You can always use an MPI_Alltoall  
before the MPI_Alltoallv to exchange the lengths that you expect.
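
For illustration, the count-exchange pattern described here could look like the sketch below. It is a hypothetical helper with MPI_INT payloads and invented names; real code would adapt the datatypes and actually use the received data.

#include <mpi.h>
#include <stdlib.h>

/* senddata must be packed contiguously by destination rank, with
 * sendcounts[i] ints destined for rank i. */
void exchange(int *senddata, int *sendcounts, MPI_Comm comm)
{
    int size, i, total_recv;
    int *recvcounts, *sdispls, *rdispls, *recvdata;

    MPI_Comm_size(comm, &size);
    recvcounts = malloc(size * sizeof *recvcounts);
    sdispls    = malloc(size * sizeof *sdispls);
    rdispls    = malloc(size * sizeof *rdispls);

    /* Step 1: every rank learns exactly how much each peer will send it. */
    MPI_Alltoall(sendcounts, 1, MPI_INT, recvcounts, 1, MPI_INT, comm);

    /* Step 2: build displacements from the now-known counts. */
    sdispls[0] = rdispls[0] = 0;
    for (i = 1; i < size; i++) {
        sdispls[i] = sdispls[i-1] + sendcounts[i-1];
        rdispls[i] = rdispls[i-1] + recvcounts[i-1];
    }
    total_recv = rdispls[size-1] + recvcounts[size-1];
    recvdata = malloc(total_recv * sizeof *recvdata);

    /* Step 3: matched send/recv counts, as the standard requires. */
    MPI_Alltoallv(senddata, sendcounts, sdispls, MPI_INT,
                  recvdata, recvcounts, rdispls, MPI_INT, comm);

    /* ... use recvdata here ... */
    free(recvcounts); free(sdispls); free(rdispls); free(recvdata);
}

The extra MPI_Alltoall adds one more collective, but it guarantees the matched per-pair send and receive counts that the standard requires.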


   george.

On Sep 10, 2008, at 1:46 PM, Daniel Spångberg wrote:


Dear all,

First some background, the real question is at the end of this  
(longish) mail.


I have a problem where I need to exchange data between all processes.  
The data is unevenly distributed and I thought at first I could use  
MPI_Alltoallv to transfer the data. However, in my case, the receivers  
do not know how many data items the senders will send, but it is  
relatively easy to set up so the receiver can figure out the maximum  
number of items the sender will send, so I set the recvcounts to the  
maximum possible, and the sendcounts to the actual number of elements  
(smaller than recvcounts).


The mpi-forum description (from  
http://www.mpi-forum.org/docs/mpi21-report/node99.htm) describes the  
following:


MPI_ALLTOALLV(sendbuf, sendcounts, sdispls, sendtype, recvbuf,  
recvcounts, rdispls, recvtype, comm)

IN sendbuf  starting address of send buffer (choice)
IN sendcounts	integer array equal to the group size specifying the  
number of elements to send to each processor
IN sdispls	integer array (of length group size). Entry j specifies the  
displacement (relative to sendbuf) from which to take the outgoing data  
destined for process j

IN sendtype data type of send buffer elements (handle)
OUT recvbuf address of receive buffer (choice)
IN recvcounts	integer array equal to the group size specifying the  
number of elements that can be received from each processor
IN rdispls	integer array (of length group size). Entry i specifies the  
displacement (relative to recvbuf) at which to place the incoming data  
from process i

IN recvtype data type of receive buffer elements (handle)
IN comm communicator (handle)

In particular the wording is "the number of elements that can be  
received from each processor" for recvcounts, and does not say that  
this must be exactly the same as the number of elements sent.


It also mentions that it should work similarly as a number of  
independent MPI_Send/MPI_Recv calls. The amount of data sent in such a  
case does not need to exactly match the amount of data received.


I, unfortunately, missed the following:

The type signature associated with sendcounts[j], sendtypes[j] at  
process i must be equal to the type signature associated with  
recvcounts[i], recvtypes[i] at process j. This implies that the amount  
of data sent must be equal to the amount of data received, pairwise  
between every pair of processes. Distinct type maps between sender and  
receiver are still allowed.


And the openmpi man page shows
  When a pair of processes exchanges data, each may pass different
  element count and datatype arguments so long as the sender specifies
  the same amount of data to send (in bytes) as the receiver expects to
  receive.

I did test my program on different send/recv counts, and while it  
sometimes works, sometimes it does not. Even if it worked I would not  
be comfortable using it anyway.



The question is: If there is no way of determining the length of the  
data sent by the sender on the receiving end, I see two options: Either  
always transmit too much data using MPI_Alltoall(v) or cook up my own  
routine based on PTP calls, probably MPI_Sendrecv is the best option.  
Am I missing something?


--Daniel Spångberg
Materials Chemistry
Uppsala University
Sweden




--
Daniel Spångberg
Materialkemi
Uppsala Universitet


[OMPI users] Strange segfault in openmpi

2008-09-19 Thread Daniel Hansen
[...] is there anything the user can do to
get more informative error messages?  The user mentioned that this
particular program ran fine before we upgraded to the current openmpi
version, and that he can't find any bugs in his code.

Thanks for your help,

Daniel Hansen
Systems Administrator
BYU Fulton Supercomputing Lab


[OMPI users] segfault issue - possible bug in openmpi

2008-10-03 Thread Daniel Hansen
I have been testing some code against openmpi lately that always causes it
to crash during certain mpi function calls.  The code does not seem to be
the problem, as it runs just fine against mpich.  I have tested it against
openmpi 1.2.5, 1.2.6, and 1.2.7 and they all exhibit the same problem.
Also, the problem only occurs in openmpi when running more than 16
processes.  I have posted this stack trace to the list before, but I am
submitting it now as a potential bug report.  I need some help debugging it
and finding out exactly what is going on in openmpi when the segfault
occurs.  Are there any suggestions on how best to do this?  Is there an easy
way to attach gdb to one of the processes or something??  I have already
compiled openmpi with debugging, memory profiling, etc.  How can I best take
advantage of these features?
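
One common way to do that is the usual attach-and-wait trick, sketched below; nothing in it is specific to Open MPI, and the function name is invented. Drop the call near the failure point, note the printed PID, and attach from another shell with gdb -p <PID>.

#include <stdio.h>
#include <unistd.h>

/* Call this early in the rank you want to inspect.  Attach with
 *   gdb -p <PID>
 * then, in gdb, select the wait_for_debugger frame, set var holdit = 0,
 * and continue. */
void wait_for_debugger(void)
{
    volatile int holdit = 1;
    char host[256];
    gethostname(host, sizeof(host));
    printf("pid %d on %s waiting for debugger attach\n",
           (int)getpid(), host);
    fflush(stdout);
    while (holdit)
        sleep(5);
}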

Thanks,
Daniel Hansen
Systems Administrator
BYU Fulton Supercomputing Lab


Re: [OMPI users] segfault issue - possible bug in openmpi

2008-10-03 Thread Daniel Hansen
Oh, by the way, here is the segfault:

[m4b-1-8:11481] *** Process received signal ***
[m4b-1-8:11481] Signal: Segmentation fault (11)
[m4b-1-8:11481] Signal code: Address not mapped (1)
[m4b-1-8:11481] Failing at address: 0x2b91c69eed
[m4b-1-8:11483] [ 0] /lib64/libpthread.so.0 [0x33e8c0de70]
[m4b-1-8:11483] [ 1] /fslhome/dhansen7/openmpi/lib/libmpi.so.0
[0x2abea7c0]
[m4b-1-8:11483] [ 2] /fslhome/dhansen7/openmpi/lib/libmpi.so.0
[0x2abea675]
[m4b-1-8:11483] [ 3]
/fslhome/dhansen7/openmpi/lib/libmpi.so.0(mca_pml_ob1_send+0x2da)
[0x2abeaf55]
[m4b-1-8:11483] [ 4]
/fslhome/dhansen7/openmpi/lib/libmpi.so.0(MPI_Send+0x28e) [0x2ab52c5a]
[m4b-1-8:11483] [ 5]
/fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham(twham_init+0x708)
[0x42a8a8]
[m4b-1-8:11483] [ 6]
/fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham(repexch+0x73c)
[0x425d5c]
[m4b-1-8:11483] [ 7]
/fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham(main+0x855)
[0x4133a5]
[m4b-1-8:11483] [ 8] /lib64/libc.so.6(__libc_start_main+0xf4) [0x33e841d8a4]
[m4b-1-8:11483] [ 9]
/fslhome/dhansen7/compute/for_DanielHansen/replica_mpi_marylou2/Openmpi_md_twham
[0x4040b9]
[m4b-1-8:11483] *** End of error message ***



On Fri, Oct 3, 2008 at 3:20 PM, Daniel Hansen  wrote:

> I have been testing some code against openmpi lately that always causes it
> to crash during certain mpi function calls.  The code does not seem to be
> the problem, as it runs just fine against mpich.  I have tested it against
> openmpi 1.2.5, 1.2.6, and 1.2.7 and they all exhibit the same problem.
> Also, the problem only occurs in openmpi when running more than 16
> processes.  I have posted this stack trace to the list before, but I am
> submitting it now as a potential bug report.  I need some help debugging it
> and finding out exactly what is going on in openmpi when the segfault
> occurs.  Are there any suggestions on how best to do this?  Is there an easy
> way to attach gdb to one of the processes or something??  I have already
> compiled openmpi with debugging, memory profiling, etc.  How can I best take
> advantage of these features?
>
> Thanks,
> Daniel Hansen
> Systems Administrator
> BYU Fulton Supercomputing Lab
>


[OMPI users] Disconnections

2009-07-01 Thread Daniel Miles
Hi, everybody.

I'm having trouble where one of my client nodes crashes while I have an MPI
job on it. When this happens, the mpirun process on the head node never
returns. I can kill it with a SIGINT (ctrl-c) and it still cleans up its
child processes on the remaining healthy client nodes but I don't get any of
the results from those client processes.

Does anybody have any ideas about how I could create a more fault-tolerant
MPI job? In an ideal world, my head node would report that it lost the
connection to a client node and keep going as if that client never existed
(so that the results of the job are what they would have been if the
crashed-node wasn't part of the job to begin with).


[OMPI users] Very different speed of collective tuned algorithms for alltoallv

2009-08-29 Thread Daniel Spångberg

Dear OpenMPI list,

I noticed a performance problem when increasing the number of CPU's used  
to solve my problem. I traced the problem to the MPI_Alltoallv calls. It  
turns out the default basic linear algorithm is very sensitive to the  
number of CPU's, but the pairwise routine behaves appropriately in my  
case. I have performed tests on 16 processes and 24 processes. I have  
three 8 core nodes (dual intel quadcore 2.5 GHz), connected with GBE for  
these tests. The test sends data (about 12k from each node to every other  
node.) I know alltoallv is not the best choice if the data sizes are the  
same, but this way it reproduces the situation in my original code.


I have set "coll_tuned_use_dynamic_rules=1" in  
$HOME/.openmpi/mca-params.conf


For default runs I used:
time mpirun -np 16 -machinefile hostfile ./testalltoallv
For the basic linear algorithm I used:
time mpirun -np 16 -machinefile hostfile -mca  
coll_tuned_alltoallv_algorithm 1 ./testalltoallv

For the pairwise algorithm I used:
time mpirun -np 16 -machinefile hostfile -mca  
coll_tuned_alltoallv_algorithm 2 ./testalltoallv


For 24 processes I replaced -np 16 with -np 24. The results (runtime in  
seconds):


 -np 16   -np 24
default   2.1  15.6
basic linear  2.1  15.6
pairwise  2.1   2.8

***
A speed difference of almost a factor 6 !!!
***

The test code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  const int data_size=3000;
  int repeat=100;
  int rank,size;
  int i,j;
  int *sendbuf, *sendcount, *senddispl;
  int *recvbuf, *recvcount, *recvdispl;

  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&size);

  sendbuf=malloc(size * data_size * sizeof *sendbuf);
  recvbuf=malloc(size * data_size * sizeof *recvbuf);
  sendcount=malloc(size * sizeof *sendcount);
  senddispl=malloc(size * sizeof *senddispl);
  recvcount=malloc(size * sizeof *recvcount);
  recvdispl=malloc(size * sizeof *recvdispl);


  /* Set up maximum receive lengths
 (*sizeof(int) because MPI_BYTE is used later on) */
  for (i=0; i< ...
  [the rest of the test code was truncated in the archive]

For me the problem is essentially solved, since I can now change the  
algorithm and get reasonable speed for my problem, but I was somewhat  
surprised about the very large difference in speed, so I wanted to report  
it here, if other users find themselves in a similar situation.


--
Daniel Spångberg
Materialkemi
Uppsala Universitet


[OMPI users] openmpi 1.4 broken -mca coll_tuned_use_dynamic_rules 1

2009-12-30 Thread Daniel Spångberg
openmpi_1.4_test [0x400869]
[girasole:27506] *** End of error message ***
[girasole:27508] *** Process received signal ***
[girasole:27508] Signal: Segmentation fault (11)
[girasole:27508] Signal code:  (128)
[girasole:27508] Failing at address: (nil)
[girasole:27508] [ 0] /lib64/libpthread.so.0 [0x32c780de80]
[girasole:27508] [ 1]  
/home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so  
[0x2b89b09a1eb5]
[girasole:27508] [ 2]  
/home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/openmpi/mca_coll_tuned.so  
[0x2b89b09a08ca]
[girasole:27508] [ 3]  
/home/daniels/src/MISC/openmpi-1.4/openmpi-1.4_install/lib/libmpi.so.0(MPI_Alltoall+0x15f)  
[0x2b89ac711bff]

[girasole:27508] [ 4] ./bug_openmpi_1.4_test(main+0x97) [0x4009b7]
[girasole:27508] [ 5] /lib64/libc.so.6(__libc_start_main+0xf4)  
[0x32c6c1d8b4]

[girasole:27508] [ 6] ./bug_openmpi_1.4_test [0x400869]
[girasole:27508] *** End of error message ***


Best regards,

--
Daniel Spångberg
Materialkemi
Uppsala Universitet


Re: [OMPI users] openmpi 1.4 broken -mca coll_tuned_use_dynamic_rules 1

2009-12-30 Thread Daniel Spångberg
Interesting. I found your issue before I sent my report, but I did not  
realise that this was the same problem. I see now that your example is  
really for openmpi 1.3.4++


Do you know of a work around? I have not used a rule file before and seem  
to be unable to find the documentation for how to use one, unfortunately.


Daniel

On 2009-12-30 15:17:17, Lenny Verkhovsky wrote:



This is a known issue,
https://svn.open-mpi.org/trac/ompi/ticket/2087
Maybe it's priority should be raised up.
Lenny.


Re: [OMPI users] openmpi 1.4 broken -mca coll_tuned_use_dynamic_rules 1

2009-12-30 Thread Daniel Spångberg
Thanks for the help with how to set up the collectives file. I am unable  
to make it work though,


My simple alltoall test is still crashing, although I even added a  
line specifically for my test commsize of 64 and 100 bytes using bruck.


daniels@kalkyl1:~/.openmpi > cat mca-params.conf
coll_tuned_use_dynamic_rules=1
coll_base_verbose=0
coll_tuned_dynamic_rules_filename="/home/daniels/.openmpi/dynamic_rules_file"
daniels@kalkyl1:~/.openmpi > cat dynamic_rules_file
1 # num of collectives
3 # ID = 3 Alltoall collective (ID in coll_tuned.h)
1 # number of com sizes
64 # comm size 64
3 # number of msg sizes
0 3 0 0 # for message size 0, bruck 1, topo 0, 0 segmentation
100 3 0 0 # for message size 100, bruck 1, topo 0, 0 segmentation
8192 2 0 0 # 8k+, pairwise 2, no topo or segmentation
# end of collective rule

Still, it is useful to know how to do this for when this issue gets fixed in the future!


Daniel



On 2009-12-30 15:57:50, Lenny Verkhovsky wrote:



The only workaround that I found is a file with dynamic rules.
This is an example that George sent me once. It helped for me, until it  
will

be fixed.

" Lenny,

You asked for dynamic rules but it looks like you didn't provide them.
Dynamic rules allow the user to specify which algorithm to be used for  
each
collective based on a set of rules. I corrected the current behavior, so  
it
will not crash. However, as you didn't provide dynamic rules, it will  
just

switch back to default behavior (i.e. ignore the
coll_tuned_use_dynamic_rules MCA parameter).

As an example, here is a set of dynamic rules. I added some comment to
clarify it, but if you have any questions please ask.

2 # num of collectives
3 # ID = 3 Alltoall collective (ID in coll_tuned.h)
1 # number of com sizes
64 # comm size 64
2 # number of msg sizes
0 3 0 0 # for message size 0, bruck 1, topo 0, 0 segmentation
8192 2 0 0 # 8k+, pairwise 2, no topo or segmentation
# end of collective rule
#
2 # ID = 2 Allreduce collective (ID in coll_tuned.h)
1 # number of com sizes
1 # comm size 2
2 # number of msg sizes
0 1 0 0 # for message size 0, basic linear 1, topo 0, 0 segmentation
1024 2 0 0 # for messages size > 1024, nonoverlapping 2, topo 0, 0
segmentation
# end of collective rule
#

And here is what I have in my $(HOME)/.openmpi/mca-params.conf to  
activate

them:
#
# Dealing with collective
#
coll_base_verbose = 0

coll_tuned_use_dynamic_rules = 1
coll_tuned_dynamic_rules_filename = **the name of the file where you  
saved

the rules **

"

On Wed, Dec 30, 2009 at 4:44 PM, Daniel Spångberg  
wrote:



Interesting. I found your issue before I sent my report, but I did not
realise that this was the same problem. I see now that your example is
really for openmpi 1.3.4++

Do you know of a work around? I have not used a rule file before and  
seem
to be unable to find the documentation for how to use one,  
unfortunately.


Daniel

On 2009-12-30 15:17:17, Lenny Verkhovsky wrote:


 This is a known issue,

   https://svn.open-mpi.org/trac/ompi/ticket/2087
Maybe it's priority should be raised up.
Lenny.






--
Daniel Spångberg
Materialkemi
Uppsala Universitet


Re: [OMPI users] openmpi 1.4 broken -mca coll_tuned_use_dynamic_rules 1

2009-12-30 Thread Daniel Spångberg

That works! Many thanks!

Daniel

On 2009-12-30 16:44:52, Lenny Verkhovsky wrote:



it may crash if it doesn't see a file with rules.
try providing it through the command line
$mpirun -mca coll_tuned_use_dynamic_rules 1 -mca
coll_tuned_dynamic_rules_filename full_path_to_file_  .

On Wed, Dec 30, 2009 at 5:35 PM, Daniel Spångberg  
wrote:


Thanks for the help with how to set up the collectives file. I am  
unable to

make it work though,

My simple alltoall test is still crashing, although I even added a
line specifically for my test commsize of 64 and 100 bytes using bruck.

daniels@kalkyl1:~/.openmpi > cat mca-params.conf

coll_tuned_use_dynamic_rules=1
coll_base_verbose=0

coll_tuned_dynamic_rules_filename="/home/daniels/.openmpi/dynamic_rules_file"
daniels@kalkyl1:~/.openmpi > cat dynamic_rules_file

1 # num of collectives
3 # ID = 3 Alltoall collective (ID in coll_tuned.h)
1 # number of com sizes
64 # comm size 64
3 # number of msg sizes
0 3 0 0 # for message size 0, bruck 1, topo 0, 0 segmentation
100 3 0 0 # for message size 100, bruck 1, topo 0, 0 segmentation

8192 2 0 0 # 8k+, pairwise 2, no topo or segmentation
# end of collective rule

Still, it is useful to know how to do this for when this issue gets fixed in the future!

Daniel



On 2009-12-30 15:57:50, Lenny Verkhovsky wrote:


 The only workaround that I found is a file with dynamic rules.

This is an example that George sent me once. It helped for me, until it
will
be fixed.

" Lenny,

You asked for dynamic rules but it looks like you didn't provide them.
Dynamic rules allow the user to specify which algorithm to be used for
each
collective based on a set of rules. I corrected the current behavior,  
so

it
will not crash. However, as you didn't provide dynamic rules, it will  
just

switch back to default behavior (i.e. ignore the
coll_tuned_use_dynamic_rules MCA parameter).

As an example, here is a set of dynamic rules. I added some comment to
clarify it, but if you have any questions please ask.

2 # num of collectives
3 # ID = 3 Alltoall collective (ID in coll_tuned.h)
1 # number of com sizes
64 # comm size 64
2 # number of msg sizes
0 3 0 0 # for message size 0, bruck 1, topo 0, 0 segmentation
8192 2 0 0 # 8k+, pairwise 2, no topo or segmentation
# end of collective rule
#
2 # ID = 2 Allreduce collective (ID in coll_tuned.h)
1 # number of com sizes
1 # comm size 2
2 # number of msg sizes
0 1 0 0 # for message size 0, basic linear 1, topo 0, 0 segmentation
1024 2 0 0 # for messages size > 1024, nonoverlapping 2, topo 0, 0
segmentation
# end of collective rule
#

And here is what I have in my $(HOME)/.openmpi/mca-params.conf to  
activate

them:
#
# Dealing with collective
#
coll_base_verbose = 0

coll_tuned_use_dynamic_rules = 1
coll_tuned_dynamic_rules_filename = **the name of the file where you  
saved

the rules **

"

On Wed, Dec 30, 2009 at 4:44 PM, Daniel Spångberg wrote:

 Interesting. I found your issue before I sent my report, but I did not

realise that this was the same problem. I see now that your example is
really for openmpi 1.3.4++

Do you know of a work around? I have not used a rule file before and  
seem
to be unable to find the documentation for how to use one,  
unfortunately.


Daniel

On 2009-12-30 15:17:17, Lenny Verkhovsky <lenny.verkhov...@gmail.com> wrote:


 This is a known issue,


  https://svn.open-mpi.org/trac/ompi/ticket/2087
Maybe it's priority should be raised up.
Lenny.





--
Daniel Spångberg
Materialkemi
Uppsala Universitet




--
Daniel Spångberg
Materialkemi
Uppsala Universitet


Re: [OMPI users] dynamic rules

2010-01-15 Thread Daniel Spångberg
I have done this according to a suggestion on this list, until a fix comes  
that makes it possible to change it via the command line:


To choose bruck for all message sizes / mpi sizes with openmpi-1.4

File $HOME/.openmpi/mca-params.conf (adjust the /home path below so it points to the correct file):

coll_tuned_use_dynamic_rules=1
coll_tuned_dynamic_rules_filename="/home/.openmpi/dynamic_rules_file"

file $HOME/.openmpi/dynamic_rules_file:
1 # num of collectives
3 # ID = 3 Alltoall collective (ID in coll_tuned.h)
1 # number of com sizes
0 # comm size
1 # number of msg sizes
0 3 0 0 # for message size 0, bruck, topo 0, 0 segmentation
# end of collective rule

Change the number 3 to something else for other algorithms (can be found  
with ompi_info -a for example):


   MCA coll: information "coll_tuned_alltoall_algorithm_count" (value: "4")
  Number of alltoall algorithms available
MCA coll: parameter "coll_tuned_alltoall_algorithm"  
(current value: "0")
  Which alltoall algorithm is used. Can be locked  
down to choice of: 0 ignore, 1 basic linear, 2 pairwise, 3: modified  
bruck, 4: two proc only.


HTH
Daniel Spångberg



On 2010-01-15 13:54:33, Roman Martonak wrote:

On my machine I need to use dynamic rules to enforce the bruck or pairwise algorithm for alltoall, since unfortunately the default basic linear algorithm performs quite poorly on my Infiniband network. A few months ago I noticed that in the case of VASP, however, the use of dynamic rules via --mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_dynamic_rules_filename dyn_rules has no effect at all. Later it was identified that there was a bug causing the dynamic rules to apply only to MPI_COMM_WORLD but not to other communicators. As far as I understand, the bug was fixed in openmpi-1.3.4. I tried now the openmpi-1.4 version and expected that tuning of alltoall via dynamic rules would work, but there is still no effect at all. Even worse, now it is not even possible to use static rules (which worked previously) such as -mca coll_tuned_alltoall_algorithm 3, because the code would crash (as discussed on the list recently).
When running with --mca coll_base_verbose 1000, I get messages like

[compute-0-0.local:08011] coll:sm:comm_query (0/MPI_COMM_WORLD):
intercomm, comm is too small, or not all peers local; disqualifying
myself
[compute-0-0.local:08011] coll:base:comm_select: component not  
available: sm

[compute-0-0.local:08011] coll:base:comm_select: component available:
sync, priority: 50
[compute-0-3.local:26116] coll:base:comm_select: component available:
self, priority: 75
[compute-0-3.local:26116] coll:sm:comm_query (1/MPI_COMM_SELF):
intercomm, comm is too small, or not all peers local; disqualifying
myself
[compute-0-3.local:26116] coll:base:comm_select: component not  
available: sm

[compute-0-3.local:26116] coll:base:comm_select: component available:
sync, priority: 50
[compute-0-3.local:26116] coll:base:comm_select: component not  
available: tuned

[compute-0-0.local:08011] coll:base:comm_select: component available:
tuned, priority: 30

Is there now a way to use other alltoall algorithms instead of the
basic linear algorithm in openmpi-1.4.x ?

Thanks in advance for any suggestion.

Best regards

Roman Martonak
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Daniel Spångberg
Materialkemi
Uppsala Universitet


Re: [OMPI users] dynamic rules

2010-01-15 Thread Daniel Spångberg

I tried this and it still crashes with openmpi-1.4. Is it supposed to
work with openmpi-1.4
or do I need to compile openmpi-1.4.1 ?



Terribly sorry, I should have checked my own notes thoroughly before giving others advice. One needs to give the dynamic rules file location on the command line:


mpirun -mca coll_tuned_use_dynamic_rules 1 -mca  
coll_tuned_dynamic_rules_filename /home/.openmpi/dynamic_rules_file


That works for me with openmpi 1.4. I have not tried 1.4.1 yet.
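To double-check that the rules file is actually being read, a sketch (./app is a placeholder for any MPI binary) is to add some collective verbosity to the same command:

mpirun -np 4 -mca coll_tuned_use_dynamic_rules 1 -mca coll_tuned_dynamic_rules_filename /home/.openmpi/dynamic_rules_file -mca coll_base_verbose 100 ./app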

Daniel


Re: [OMPI users] dynamic rules

2010-01-20 Thread Daniel Spångberg

On 2010-01-16 16:31:24, Roman Martonak wrote:


Terribly sorry, I should checked my own notes thoroughly before giving
others advice. One needs to give the dynamic rules file location on the
command line:

mpirun -mca coll_tuned_use_dynamic_rules 1 -mca
coll_tuned_dynamic_rules_filename /home/.openmpi/dynamic_rules_file

That works for me with openmpi 1.4. I have not tried 1.4.1 yet.


Thanks, I will try it. VASP uses cartesian topology communicators.
Should the dynamic
rules work also for this case in openmpi-1.4 ? In openmpi-1.3.2 and
previous versions
the dynamic rules specified via a dynamic rules file had no effect at
all for VASP.


I just tried alltoall with a communicator I created that uses half the slots, with a 256-byte message size. The fixed rules use bruck for messages smaller than 200 bytes on (I think) 12 processes and up, so my test should never use bruck. On 512 cores (using 256 for the comm) the fixed rules take about 10 ms / alltoall. Using the dynamic rules file forcing bruck makes it take about 1 ms / alltoall, so 10 times quicker. So, yes, it seems that communicators other than MPI_COMM_WORLD are affected by the dynamic rules file.


My communicator:

  MPI_Comm_group(MPI_COMM_WORLD,&world_group);
  MPI_Comm_size(MPI_COMM_WORLD,&size);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  ranges[0][0]=0;
  ranges[0][1]=size-2;
  ranges[0][2]=2;
  MPI_Group_range_incl(world_group,1,ranges,&half_group);
  MPI_Comm_create(MPI_COMM_WORLD,half_group,&half_comm);

HTH
--
Daniel Spångberg
Materialkemi
Uppsala Universitet


[OMPI users] Ok, I've got OpenMPI set up, now what?!

2010-07-17 Thread Daniel Janzon
Dear OpenMPI Users,

I successfully installed OpenMPI on some FreeBSD machines and I can
run MPI programs on the cluster. Yippie!

But I'm not patient enough to write my own MPI-based routines. So I
thought maybe I could ask here for suggestions. I am primarily
interested in general linear algebra routines. The best would be to, for instance, start Octave and just use it as normal, except that all matrix operations would run on the cluster. Has anyone done that? The
octave-parallel package seems to be something different.

I installed scalapack and the test files ran successfully with mpirun
(except a few of them). But the source code examples of scalapack
looks terrible. Is there no higher-level library that provides an API
with matrix operations, which have all MPI parallelism stuff handled
for you in the background? Certainly a smart piece of software can
decide better than me how to chunk up a matrix and pass it out to the
available processes.

All the best,
Daniel


Re: [OMPI users] Ok, I've got OpenMPI set up, now what?!

2010-07-19 Thread Daniel Janzon
Thanks a lot! PETSc seems to be really solid and integrates with MUMPS
suggested by Damien.

All the best,
Daniel Janzon

On 7/18/10, Gustavo Correa  wrote:
> Check PETSc:
> http://www.mcs.anl.gov/petsc/petsc-as/
>
> On Jul 18, 2010, at 12:37 AM, Damien wrote:
>
>> You should check out the MUMPS parallel linear solver.
>>
>> Damien
>> Sent from my iPhone
>>
>> On 2010-07-17, at 5:16 PM, Daniel Janzon  wrote:
>>
>>> Dear OpenMPI Users,
>>>
>>> I successfully installed OpenMPI on some FreeBSD machines and I can
>>> run MPI programs on the cluster. Yippie!
>>>
>>> But I'm not patient enough to write my own MPI-based routines. So I
>>> thought maybe I could ask here for suggestions. I am primarily
>>> interested in general linear algebra routines. The best would be to
>>> for instance start Octave and just use it as normal, only that all
>>> matrix operations would run on the cluster. Has anyone done that? The
>>> octave-parallel package seems to be something different.
>>>
>>> I installed scalapack and the test files ran successfully with mpirun
>>> (except a few of them). But the source code examples of scalapack
>>> looks terrible. Is there no higher-level library that provides an API
>>> with matrix operations, which have all MPI parallelism stuff handled
>>> for you in the background? Certainly a smart piece of software can
>>> decide better than me how to chunk up a matrix and pass it out to the
>>> available processes.
>>>
>>> All the best,
>>> Daniel
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


[OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-18 Thread Daniel Letai

given a simple hello.c:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
int size, rank, len;
char name[MPI_MAX_PROCESSOR_NAME];

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(name, &len);

printf("%s: Process %d out of %d\n", name, rank, size);

MPI_Finalize();
}

for n=1
mpirun -n 1 ./hello
it works correctly.

for n>1 it segfaults with signal 11
used gdb to trace the problem to ml coll:

Program received signal SIGSEGV, Segmentation fault.
0x76750845 in ml_coll_hier_barrier_setup()
from /lib/openmpi/mca_coll_ml.so

running with
mpirun -n 2 --mca coll ^ml ./hello
works correctly
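As a side note, the same exclusion can also be made persistent rather than passed on every command line - a minimal sketch, assuming the default per-user MCA parameter file location:

echo "coll = ^ml" >> $HOME/.openmpi/mca-params.conf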

using mellanox ofed 2.3-2.0.5-rhel6.4-x86_64, if it's at all relevant.
openmpi 1.8.5 was built with following options:
rpmbuild --rebuild --define 'configure_options --with-verbs=/usr 
--with-verbs-libdir=/usr/lib64 CC=gcc CXX=g++ FC=gfortran CFLAGS="-g 
-O3" --enable-mpirun-prefix-by-default 
--enable-orterun-prefix-by-default --disable-debug 
--with-knem=/opt/knem-1.1.1.90mlnx --with-platform=optimized 
--without-mpi-param-check --with-contrib-vt-flags=--disable-iotrace 
--enable-builtin-atomics --enable-cxx-exceptions --enable-sparse-groups 
--enable-mpi-thread-multiple --enable-memchecker 
--enable-btl-openib-failover --with-hwloc=internal --with-verbs --with-x 
--with-slurm --with-pmi=/opt/slurm --with-fca=/opt/mellanox/fca 
--with-mxm=/opt/mellanox/mxm --with-hcoll=/opt/mellanox/hcoll' 
openmpi-1.8.5-1.src.rpm


gcc version 5.1.1

Thanks in advance


Re: [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-18 Thread Daniel Letai

No, that's the issue.
I had to disable it to get things working.

That's why I included my config settings - I couldn't figure out which 
option enabled it, so I could remove it from the configuration...


On 06/18/2015 02:43 PM, Gilles Gouaillardet wrote:

Daniel,

ML module is not ready for production and is disabled by default.

Did you explicitly enable this module ?
If yes, I encourage you to disable it

Cheers,

Gilles

On Thursday, June 18, 2015, Daniel Letai <d...@letai.org.il> wrote:


given a simple hello.c:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char* argv[])
{
int size, rank, len;
char name[MPI_MAX_PROCESSOR_NAME];

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(name, &len);

printf("%s: Process %d out of %d\n", name, rank, size);

MPI_Finalize();
}

for n=1
mpirun -n 1 ./hello
it works correctly.

for n>1 it segfaults with signal 11
used gdb to trace the problem to ml coll:

Program received signal SIGSEGV, Segmentation fault.
0x76750845 in ml_coll_hier_barrier_setup()
from /lib/openmpi/mca_coll_ml.so

running with
mpirun -n 2 --mca coll ^ml ./hello
works correctly

using mellanox ofed 2.3-2.0.5-rhel6.4-x86_64, if it's at all relevant.
openmpi 1.8.5 was built with following options:
rpmbuild --rebuild --define 'configure_options --with-verbs=/usr
--with-verbs-libdir=/usr/lib64 CC=gcc CXX=g++ FC=gfortran
CFLAGS="-g -O3" --enable-mpirun-prefix-by-default
--enable-orterun-prefix-by-default --disable-debug
--with-knem=/opt/knem-1.1.1.90mlnx --with-platform=optimized
--without-mpi-param-check
--with-contrib-vt-flags=--disable-iotrace --enable-builtin-atomics
--enable-cxx-exceptions --enable-sparse-groups
--enable-mpi-thread-multiple --enable-memchecker
--enable-btl-openib-failover --with-hwloc=internal --with-verbs
--with-x --with-slurm --with-pmi=/opt/slurm
--with-fca=/opt/mellanox/fca --with-mxm=/opt/mellanox/mxm
--with-hcoll=/opt/mellanox/hcoll' openmpi-1.8.5-1.src.rpm

gcc version 5.1.1

Thanks in advance
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/06/27154.php



___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/06/27155.php




Re: [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-18 Thread Daniel Letai

Thanks, will try it on Sunday (won't have access to the system till then)

On 06/18/2015 04:36 PM, Gilles Gouaillardet wrote:

This is really odd...

you can run
ompi_info --all
and search coll_ml_priority

it will display the current value and the origin
(e.g. default, system wide config, user config, cli, environment variable)
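For example, one way to do that search (a sketch; it assumes the ompi_info from the same installation is first in the PATH):

ompi_info --all | grep coll_ml_priority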

Cheers,

Gilles

On Thursday, June 18, 2015, Daniel Letai <d...@letai.org.il> wrote:


No, that's the issue.
I had to disable it to get things working.

That's why I included my config settings - I couldn't figure out
which option enabled it, so I could remove it from the
configuration...

On 06/18/2015 02:43 PM, Gilles Gouaillardet wrote:

Daniel,

ML module is not ready for production and is disabled by default.

Did you explicitly enable this module ?
If yes, I encourage you to disable it

Cheers,

Gilles

    On Thursday, June 18, 2015, Daniel Letai > wrote:

given a simple hello.c:

#include 
#include 

int main(int argc, char* argv[])
{
int size, rank, len;
char name[MPI_MAX_PROCESSOR_NAME];

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(name, &len);

printf("%s: Process %d out of %d\n", name, rank, size);

MPI_Finalize();
}

for n=1
mpirun -n 1 ./hello
it works correctly.

for n>1 it segfaults with signal 11
used gdb to trace the problem to ml coll:

Program received signal SIGSEGV, Segmentation fault.
0x76750845 in ml_coll_hier_barrier_setup()
from /lib/openmpi/mca_coll_ml.so

running with
mpirun -n 2 --mca coll ^ml ./hello
works correctly

using mellanox ofed 2.3-2.0.5-rhel6.4-x86_64, if it's at all
relevant.
openmpi 1.8.5 was built with following options:
rpmbuild --rebuild --define 'configure_options
--with-verbs=/usr --with-verbs-libdir=/usr/lib64 CC=gcc
CXX=g++ FC=gfortran CFLAGS="-g -O3"
--enable-mpirun-prefix-by-default
--enable-orterun-prefix-by-default --disable-debug
--with-knem=/opt/knem-1.1.1.90mlnx --with-platform=optimized
--without-mpi-param-check
--with-contrib-vt-flags=--disable-iotrace
--enable-builtin-atomics --enable-cxx-exceptions
--enable-sparse-groups --enable-mpi-thread-multiple
--enable-memchecker --enable-btl-openib-failover
--with-hwloc=internal --with-verbs --with-x --with-slurm
--with-pmi=/opt/slurm --with-fca=/opt/mellanox/fca
--with-mxm=/opt/mellanox/mxm
--with-hcoll=/opt/mellanox/hcoll' openmpi-1.8.5-1.src.rpm

gcc version 5.1.1

Thanks in advance
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/06/27154.php



___
users mailing list
us...@open-mpi.org  
Subscription:http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this 
post:http://www.open-mpi.org/community/lists/users/2015/06/27155.php




___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/06/27157.php




Re: [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-21 Thread Daniel Letai
MCA coll: parameter "coll_ml_priority" (current value: "0", data source: 
default, level: 9 dev/all, type: int)


Not sure how to read this, but for any n>1 mpirun only works with --mca 
coll ^ml


Thanks for helping

On 06/18/2015 04:36 PM, Gilles Gouaillardet wrote:

This is really odd...

you can run
ompi_info --all
and search coll_ml_priority

it will display the current value and the origin
(e.g. default, system wide config, user config, cli, environment variable)

Cheers,

Gilles

On Thursday, June 18, 2015, Daniel Letai <d...@letai.org.il> wrote:


No, that's the issue.
I had to disable it to get things working.

That's why I included my config settings - I couldn't figure out
which option enabled it, so I could remove it from the
configuration...

On 06/18/2015 02:43 PM, Gilles Gouaillardet wrote:

Daniel,

ML module is not ready for production and is disabled by default.

Did you explicitly enable this module ?
If yes, I encourage you to disable it

Cheers,

    Gilles

On Thursday, June 18, 2015, Daniel Letai > wrote:

given a simple hello.c:

#include 
#include 

int main(int argc, char* argv[])
{
int size, rank, len;
char name[MPI_MAX_PROCESSOR_NAME];

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(name, &len);

printf("%s: Process %d out of %d\n", name, rank, size);

MPI_Finalize();
}

for n=1
mpirun -n 1 ./hello
it works correctly.

for n>1 it segfaults with signal 11
used gdb to trace the problem to ml coll:

Program received signal SIGSEGV, Segmentation fault.
0x76750845 in ml_coll_hier_barrier_setup()
from /lib/openmpi/mca_coll_ml.so

running with
mpirun -n 2 --mca coll ^ml ./hello
works correctly

using mellanox ofed 2.3-2.0.5-rhel6.4-x86_64, if it's at all
relevant.
openmpi 1.8.5 was built with following options:
rpmbuild --rebuild --define 'configure_options
--with-verbs=/usr --with-verbs-libdir=/usr/lib64 CC=gcc
CXX=g++ FC=gfortran CFLAGS="-g -O3"
--enable-mpirun-prefix-by-default
--enable-orterun-prefix-by-default --disable-debug
--with-knem=/opt/knem-1.1.1.90mlnx --with-platform=optimized
--without-mpi-param-check
--with-contrib-vt-flags=--disable-iotrace
--enable-builtin-atomics --enable-cxx-exceptions
--enable-sparse-groups --enable-mpi-thread-multiple
--enable-memchecker --enable-btl-openib-failover
--with-hwloc=internal --with-verbs --with-x --with-slurm
--with-pmi=/opt/slurm --with-fca=/opt/mellanox/fca
--with-mxm=/opt/mellanox/mxm
--with-hcoll=/opt/mellanox/hcoll' openmpi-1.8.5-1.src.rpm

gcc version 5.1.1

Thanks in advance
___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/06/27154.php



___
users mailing list
us...@open-mpi.org  
Subscription:http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this 
post:http://www.open-mpi.org/community/lists/users/2015/06/27155.php




___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/06/27157.php




Re: [OMPI users] simple mpi hello world segfaults when coll ml not disabled

2015-06-24 Thread Daniel Letai

Gilles,

Attached the two output logs.

Thanks,
Daniel

On 06/22/2015 08:08 AM, Gilles Gouaillardet wrote:

Daniel,

I double checked this and I cannot make any sense of these logs.

if coll_ml_priority is zero, then I do not see any way ml_coll_hier_barrier_setup can be invoked.


could you please run again with --mca coll_base_verbose 100
with and without --mca coll ^ml

Cheers,

Gilles

On 6/22/2015 12:08 AM, Gilles Gouaillardet wrote:

Daniel,

ok, thanks

It seems that even if the priority is zero, some code gets executed. I will confirm this tomorrow and send you a patch to work around the issue, if my guess is proven right.


Cheers,

Gilles

On Sunday, June 21, 2015, Daniel Letai <d...@letai.org.il> wrote:


MCA coll: parameter "coll_ml_priority" (current value: "0", data
source: default, level: 9 dev/all, type: int)

Not sure how to read this, but for any n>1 mpirun only works with
--mca coll ^ml

Thanks for helping

On 06/18/2015 04:36 PM, Gilles Gouaillardet wrote:

This is really odd...

you can run
ompi_info --all
and search coll_ml_priority

it will display the current value and the origin
(e.g. default, system wide config, user config, cli, environment
variable)

Cheers,

Gilles

On Thursday, June 18, 2015, Daniel Letai > wrote:

No, that's the issue.
I had to disable it to get things working.

That's why I included my config settings - I couldn't figure
out which option enabled it, so I could remove it from the
configuration...

On 06/18/2015 02:43 PM, Gilles Gouaillardet wrote:

Daniel,

ML module is not ready for production and is disabled by
default.

Did you explicitly enable this module ?
If yes, I encourage you to disable it

Cheers,

    Gilles

On Thursday, June 18, 2015, Daniel Letai
 wrote:

given a simple hello.c:

#include 
#include 

int main(int argc, char* argv[])
{
int size, rank, len;
char name[MPI_MAX_PROCESSOR_NAME];

MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Get_processor_name(name, &len);

printf("%s: Process %d out of %d\n", name,
rank, size);

MPI_Finalize();
}

for n=1
mpirun -n 1 ./hello
it works correctly.

for n>1 it segfaults with signal 11
used gdb to trace the problem to ml coll:

Program received signal SIGSEGV, Segmentation fault.
0x76750845 in ml_coll_hier_barrier_setup()
from /lib/openmpi/mca_coll_ml.so

running with
mpirun -n 2 --mca coll ^ml ./hello
works correctly

using mellanox ofed 2.3-2.0.5-rhel6.4-x86_64, if it's
at all relevant.
openmpi 1.8.5 was built with following options:
rpmbuild --rebuild --define 'configure_options
--with-verbs=/usr --with-verbs-libdir=/usr/lib64 CC=gcc
CXX=g++ FC=gfortran CFLAGS="-g -O3"
--enable-mpirun-prefix-by-default
--enable-orterun-prefix-by-default --disable-debug
--with-knem=/opt/knem-1.1.1.90mlnx
--with-platform=optimized --without-mpi-param-check
--with-contrib-vt-flags=--disable-iotrace
--enable-builtin-atomics --enable-cxx-exceptions
--enable-sparse-groups --enable-mpi-thread-multiple
--enable-memchecker --enable-btl-openib-failover
--with-hwloc=internal --with-verbs --with-x
--with-slurm --with-pmi=/opt/slurm
--with-fca=/opt/mellanox/fca
--with-mxm=/opt/mellanox/mxm
--with-hcoll=/opt/mellanox/hcoll' openmpi-1.8.5-1.src.rpm

gcc version 5.1.1

Thanks in advance
___
users mailing list
us...@open-mpi.org
Subscription:
http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post:
http://www.open-mpi.org/community/lists/users/2015/06/27154.php



___
users mailing list
us...@open-mpi.org
Subscription:http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this 
post:http://www.open-mpi.org/community/lists/users/2015/06/27155.php




___
users mailing list
us...@open-mpi.org  
Subscription:http://www.open-mpi.org/mailm

[OMPI users] display-map option in v1.8.8

2015-10-12 Thread Daniel Letai

Hi,
After upgrading to 1.8.8 I can no longer see the map. When looking at 
the man page for mpirun, display-map no longer exists. Is there a way to 
show the map in 1.8.8 ?


Another issue - I'd like to map 2 processes per node - 1 to each socket.
What is the current "correct" syntax? --map-by ppr:2:node doesn't 
guarantee 1 per socket. --map-by ppr:1:socket doesn't guarantee 2 per 
node. I assume it's something obvious, but the documentation is somewhat 
lacking.
I'd like to know the general syntax - even if I have 4 socket nodes I'd 
still like to map only 2 procs per node. Combining with numa/dist to 
hca/dist to gpu will be very helpful too.


Thanks,




Re: [OMPI users] display-map option in v1.8.8

2015-10-20 Thread Daniel Letai

Thanks for the reply,

On 10/13/2015 04:04 PM, Ralph Castain wrote:

On Oct 12, 2015, at 6:10 AM, Daniel Letai  wrote:

Hi,
After upgrading to 1.8.8 I can no longer see the map. When looking at the man 
page for mpirun, display-map no longer exists. Is there a way to show the map 
in 1.8.8 ?

I don’t know why/how it got dropped from the man page, but the display-map 
option certainly still exists - do “mpirun -h” to see the full list of options, 
and you’ll see it is there. I’ll ensure it gets restored to the man page in the 
1.10 series as the 1.8 series is complete.

Thanks for clarifying,



Another issue - I'd like to map 2 process per node - 1 to each socket.
What is the current "correct" syntax? --map-by ppr:2:node doesn't guarantee 1 
per Socket. --map-by ppr:1:socket doesn't guarantee 2 per node. I assume it's something 
obvious, but the documentation is somewhat lacking.
I'd like to know the general syntax - even if I have 4 socket nodes I'd still 
like to map only 2 procs per node.

That’s a tough one. I’m not sure there is a way to do that right now. Probably 
something we’d have to add. Out of curiosity, if you have 4 sockets and only 2 
procs, would you want each proc bound to 2 of the 4 sockets? Or are you 
expecting them to be bound to only 1 socket (thus leaving 2 sockets idle), or 
simply leave them unbound?
I have 2 pci devices (gpu) per node. I need 1 proc per socket to be 
bound to that socket and "talk" to its respective gpu, so no matter how 
many sockets I have - I must distribute the procs 2 per node, each in 
its own socket (actually, each in its own numa domain) and be bound.


So I expect them to be "bound to only 1 socket (thus leaving 2 sockets 
idle)".


I might run other jobs on the idle sockets (depending on mem 
utilization) but that's not an immediate concern at this time.



Combining with numa/dist to hca/dist to gpu will be very helpful too.

Definitely no way to do this one today.


Thanks,


___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/10/27860.php

___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/10/27861.php




Re: [OMPI users] display-map option in v1.8.8

2015-10-21 Thread Daniel Letai



On 10/20/2015 04:14 PM, Ralph Castain wrote:


On Oct 20, 2015, at 5:47 AM, Daniel Letai <d...@letai.org.il> wrote:


Thanks for the reply,

On 10/13/2015 04:04 PM, Ralph Castain wrote:
On Oct 12, 2015, at 6:10 AM, Daniel Letai <d...@letai.org.il> wrote:


Hi,
After upgrading to 1.8.8 I can no longer see the map. When looking 
at the man page for mpirun, display-map no longer exists. Is there 
a way to show the map in 1.8.8 ?
I don’t know why/how it got dropped from the man page, but the 
display-map option certainly still exists - do “mpirun -h” to see 
the full list of options, and you’ll see it is there. I’ll ensure it 
gets restored to the man page in the 1.10 series as the 1.8 series 
is complete.

Thanks for clarifying,



Another issue - I'd like to map 2 process per node - 1 to each socket.
What is the current "correct" syntax? --map-by ppr:2:node doesn't 
guarantee 1 per Socket. --map-by ppr:1:socket doesn't guarantee 2 
per node. I assume it's something obvious, but the documentation is 
somewhat lacking.
I'd like to know the general syntax - even if I have 4 socket nodes 
I'd still like to map only 2 procs per node.
That’s a tough one. I’m not sure there is a way to do that right 
now. Probably something we’d have to add. Out of curiosity, if you 
have 4 sockets and only 2 procs, would you want each proc bound to 2 
of the 4 sockets? Or are you expecting them to be bound to only 1 
socket (thus leaving 2 sockets idle), or simply leave them unbound?
I have 2 pci devices (gpu) per node. I need 1 proc per socket to be 
bound to that socket and "talk" to it's respective gpu, so no matter 
how many sockets I have - I must distribute the procs 2 per node, 
each in it's own socket (actually, each in it's own numa) and  be bound.


So I expect them to be "bound to only 1 socket (thus leaving 2 
sockets idle)”.


Are the gpu’s always near the same sockets for every node? If so, you 
might be able to use the cpu-set option to restrict us to those 
sockets, and then just "--map-by ppr:2:node --bind-to socket"


 -cpu-set|--cpu-set 
 Comma-separated list of ranges specifying logical
 cpus allocated to this job [default: none]



I believe this should solve the issue. So the cmdline should be something like:

mpirun --map-by ppr:2:node --bind-to socket --cpu-set 0,2

BTW, --cpu-set is also absent from the man page.
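To sanity-check the resulting placement, a sketch (./app is a placeholder for the real binary; both reporting options exist in the 1.8 series):

mpirun --map-by ppr:2:node --bind-to socket --cpu-set 0,2 --display-map --report-bindings -np 4 ./app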



I might run other jobs on the idle sockets (depending on mem 
utilization) but that's not an immediate concern at this time.



Combining with numa/dist to hca/dist to gpu will be very helpful too.

Definitely no way to do this one today.


Thanks,


___
users mailing list
us...@open-mpi.org <mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/10/27860.php

___
users mailing list
us...@open-mpi.org <mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/10/27861.php


___
users mailing list
us...@open-mpi.org <mailto:us...@open-mpi.org>
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/10/27898.php




___
users mailing list
us...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
Link to this post: 
http://www.open-mpi.org/community/lists/users/2015/10/27899.php




[OMPI users] It's possible to get mpi working without ssh?

2018-12-19 Thread Daniel Edreira
Hi all,

Does anyone know if there's a possibility to configure a cluster of nodes to 
communicate with each other with mpirun without using SSH?

Someone is asking me about making a cluster with Infiniband that does not use 
SSH to communicate using OpenMPI.

Thanks in advance

Regards.


___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] It's possible to get mpi working without ssh?

2018-12-19 Thread Daniel Edreira
Hi Jeff,



Thanks a lot for your answer, It really makes sense.



Do you know where I can find more information about the *process launch on remote nodes* part of MPI and all the alternatives I have?



Thanks and regards.




From: users  on behalf of Jeff Squyres 
(jsquyres) via users 
Sent: Wednesday, December 19, 2018 7:18:23 PM
To: Open MPI User's List
Cc: Jeff Squyres (jsquyres)
Subject: Re: [OMPI users] It's possible to get mpi working without ssh?

On Dec 19, 2018, at 11:42 AM, Daniel Edreira  wrote:
>
> Does anyone know if there's a possibility to configure a cluster of nodes to 
> communicate with each other with mpirun without using SSH?
>
> Someone is asking me about making a cluster with Infiniband that does not use 
> SSH to communicate using OpenMPI.

I'm not entirely clear what you're asking.  You mention both "ssh" and 
"communication" -- they're kinda different things.

Communication: Over an InfiniBand cluster, we'd recommend that you use the UCX 
library for Open MPI to communicate over IB.  I.e., when your MPI application 
invokes API calls like MPI_Send() and MPI_Recv(), they'll use the UCX library 
underneath and utilize native IB-style communication (i.e., they're not using 
the POSIX sockets API -- they're using the native UCX/verbs APIs for RDMA OS 
bypass and offload, ...etc.).

Ssh: ssh is not used for *communication*, per se; it's used for *starting Linux processes on remote nodes*.  Open MPI can use SSH to start processes on remote
nodes, but it can also use other mechanisms (e.g., if you have a resource 
manager such as SLURM, Open MPI can use SLURM's native remote process launching 
mechanism instead of SSH).
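For illustration, a minimal sketch of an ssh-free launch under Slurm (./mpi_app is a placeholder name, and the srun form assumes Slurm was built with PMI/PMIx support):

salloc -N 2 -n 8 mpirun ./mpi_app     # mpirun uses Slurm's native launcher inside the allocation
srun -n 8 --mpi=pmix ./mpi_app        # srun starts the MPI processes directly, no mpirun and no ssh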

These two things are orthogonal to each other: you can use whatever 
*communication* mechanism you want for MPI APIs (e.g., the POSIX sockets API or 
the UCX library or ...several other APIs...), and use whatever *process launch* 
mechanism you want (e.g., SSH or SLURM's native remote process launch or 
...several others...).  Put simply: the choice of MPI communication layer does 
not imply anything about the remote process launch mechanism, and vice versa.

Make sense?

--
Jeff Squyres
jsquy...@cisco.com

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Building PMIx and Slurm support

2019-03-03 Thread Daniel Letai

  
  
Hello,

I have built the following stack:
  centos 7.5 (gcc 4.8.5-28, libevent 2.0.21-4)
  MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tgz built with --all --without-32bit (this includes ucx 1.5.0)
  hwloc from centos 7.5: 1.11.8-4.el7
  pmix 3.1.2
  slurm 18.08.5-2 built --with-ucx --with-pmix
  openmpi 4.0.0: configure --with-slurm --with-pmix=external --with-pmi --with-libevent=external --with-hwloc=external --with-knem=/opt/knem-1.1.3.90mlnx1 --with-hcoll=/opt/mellanox/hcoll

The configure part succeeds, however 'make' errors out with:

ext3x.c: In function 'ext3x_value_unload':
ext3x.c:1109:10: error: 'PMIX_MODEX' undeclared (first use in this function)

And the same for 'PMIX_INFO_ARRAY'.

However, both are declared in the opal/mca/pmix/pmix3x/pmix/include/pmix_common.h file. opal/mca/pmix/ext3x/ext3x.c does include pmix_common.h, but as a system include (#include <pmix_common.h>), while ext3x.h includes it as a local include (#include "pmix_common"). Neither seems to pull from the correct path.

Regards,
Dani_L.


On 2/24/19 3:09 AM, Gilles Gouaillardet wrote:


  Passant,

you have to manually download and apply
https://github.com/pmix/pmix/commit/2e2f4445b45eac5a3fcbd409c81efe318876e659.patch
to PMIx 2.2.1
that should likely fix your problem.

As a side note,  it is a bad practice to configure --with-FOO=/usr
since it might have some unexpected side effects.
Instead, you can replace

configure --with-slurm --with-pmix=/usr --with-pmi=/usr --with-libevent=/usr

with

configure --with-slurm --with-pmix=external --with-pmi --with-libevent=external

to be on the safe side I also invite you to pass --with-hwloc=external
to the configure command line


Cheers,

Gilles

On Sun, Feb 24, 2019 at 1:54 AM Passant A. Hafez
 wrote:

  

Hello Gilles,

Here are some details:

Slurm 18.08.4

PMIx 2.2.1 (as shown in /usr/include/pmix_version.h)

Libevent 2.0.21

srun --mpi=list
srun: MPI types are...
srun: none
srun: openmpi
srun: pmi2
srun: pmix
srun: pmix_v2

Open MPI versions tested: 4.0.0 and 3.1.2


For each installation to be mentioned a different MPI Hello World program was compiled.
Jobs were submitted by sbatch, 2 node * 2 tasks per node then srun --mpi=pmix program

File 400ext_2x2.out (attached) is for OMPI 4.0.0 installation with configure options:
--with-slurm --with-pmix=/usr --with-pmi=/usr --with-libevent=/usr
and configure log:
Libevent support: external
PMIx support: External (2x)

File 400int_2x2.out (attached) is for OMPI 4.0.0 installation with configure options:
--with-slurm --with-pmix
and configure log:
Libevent support: internal (external libevent version is less that internal version 2.0.22)
PMIx support: Internal

Tested also different installations for 3.1.2 and got errors similar to 400ext_2x2.out
(NOT-SUPPORTED in file event/pmix_event_registration.c at line 101)





All the best,
--
Passant A. Hafez | HPC Applications Specialist
KAUST Supercomputing Core Laboratory (KSL)
King Abdullah University of Science and Technology
Building 1, Al-Khawarizmi, Room 0123
Mobile : +966 (0) 55-247-9568
Mobile : +20 (0) 106-146-9644
Office  : +966 (0) 12-808-0367


From: users  on behalf of Gilles Gouaillardet 
Sent: Saturday, February 23, 2019 5:17 PM
To: Open MPI Users
Subject: Re: [OMPI users] Building PMIx and Slurm support

Hi,

PMIx has cross-version compatibility, so as long as the PMIx library
used by SLURM is compatible with the one (internal or external) used
by Open MPI, you should be fine.
If you want to minimize the risk of cross-version incompatibility,
then I encourage you to use the same (and hence external) PMIx that
was used to build SLURM with Open MPI.

Can you tell a bit more than "it didn't work" ?
(Open MPI version, PMIx version used by SLURM, PMIx version used by
Open MPI, error message, ...)

Cheers,

Gilles

On Sat, Feb 23, 2019 at 9:46 PM Passant A. Hafez
 wrote:


  

Good day everyone,

I've trying to build and use the PMIx support for Open MPI but I tried many things that I can list if needed, but with no luck.
I was able to test the PMIx client but when I used OMPI specifying srun --mpi=pmix it didn't work.

So if you please advise me with the versions of each PMIx and Open MPI that should be working well with Slurm 18.08, it'd be great.

Also, what is the difference between using internal vs external PMIx installations?



All the best,

--

Passant A. Hafez | HPC Applications Specialist
KAUST Supercomputing Core Laboratory (KSL)
King Abdullah University of Science and Technology
Building 1, Al-Khawarizmi, Room 0123
Mobile : +966 (0) 55-247-9568
Mobile : +20 (0) 106-146-9644
Office  : +966 (0) 12-808-0367
___
users mailing list
users@lists.open-mpi.or

Re: [OMPI users] Building PMIx and Slurm support

2019-03-03 Thread Daniel Letai


Sent from my iPhone

> On 3 Mar 2019, at 16:31, Gilles Gouaillardet  
> wrote:
> 
> Daniel,
> 
> PMIX_MODEX and PMIX_INFO_ARRAY have been removed from PMIx 3.1.2, and
> Open MPI 4.0.0 was not ready for this.
> 
> You can either use the internal PMIx (3.0.2), or try 4.0.1rc1 (with
> the external PMIx 3.1.2) that was published a few days ago.
> 
Thanks, will try that tomorrow. I can’t use internal due to Slurm dependency, 
but I will try the rc.
Any idea when 4.0.1 will be released?

> FWIW, you are right using --with-pmix=external (and not using 
> --with-pmix=/usr)
> 
> Cheers,
> 
> Gilles
> 
>> On Sun, Mar 3, 2019 at 10:57 PM Daniel Letai  wrote:
>> 
>> Hello,
>> 
>> 
>> I have built the following stack :
>> 
>> centos 7.5 (gcc 4.8.5-28, libevent 2.0.21-4)
>> MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tgz built with --all 
>> --without-32bit (this includes ucx 1.5.0)
>> hwloc from centos 7.5 : 1.11.8-4.el7
>> pmix 3.1.2
>> slurm 18.08.5-2 built --with-ucx --with-pmix
>> openmpi 4.0.0 : configure --with-slurm --with-pmix=external --with-pmi 
>> --with-libevent=external --with-hwloc=external 
>> --with-knem=/opt/knem-1.1.3.90mlnx1 --with-hcoll=/opt/mellanox/hcoll
>> 
>> The configure part succeeds, however 'make' errors out with:
>> 
>> ext3x.c: In function 'ext3x_value_unload':
>> 
>> ext3x.c:1109:10: error: 'PMIX_MODEX' undeclared (first use in this function)
>> 
>> 
>> And same for 'PMIX_INFO_ARRAY'
>> 
>> 
>> However, both are declared in the 
>> opal/mca/pmix/pmix3x/pmix/include/pmix_common.h file.
>> 
>> opal/mca/pmix/ext3x/ext3x.c does include pmix_common.h but as a system 
>> include #include  , while ext3x.h includes it as a local 
>> include #include "pmix_common". Neither seem to pull from the correct path.
>> 
>> 
>> Regards,
>> 
>> Dani_L.
>> 
>> 
>> On 2/24/19 3:09 AM, Gilles Gouaillardet wrote:
>> 
>> Passant,
>> 
>> you have to manually download and apply
>> https://github.com/pmix/pmix/commit/2e2f4445b45eac5a3fcbd409c81efe318876e659.patch
>> to PMIx 2.2.1
>> that should likely fix your problem.
>> 
>> As a side note,  it is a bad practice to configure --with-FOO=/usr
>> since it might have some unexpected side effects.
>> Instead, you can replace
>> 
>> configure --with-slurm --with-pmix=/usr --with-pmi=/usr --with-libevent=/usr
>> 
>> with
>> 
>> configure --with-slurm --with-pmix=external --with-pmi 
>> --with-libevent=external
>> 
>> to be on the safe side I also invite you to pass --with-hwloc=external
>> to the configure command line
>> 
>> 
>> Cheers,
>> 
>> Gilles
>> 
>> On Sun, Feb 24, 2019 at 1:54 AM Passant A. Hafez
>>  wrote:
>> 
>> Hello Gilles,
>> 
>> Here are some details:
>> 
>> Slurm 18.08.4
>> 
>> PMIx 2.2.1 (as shown in /usr/include/pmix_version.h)
>> 
>> Libevent 2.0.21
>> 
>> srun --mpi=list
>> srun: MPI types are...
>> srun: none
>> srun: openmpi
>> srun: pmi2
>> srun: pmix
>> srun: pmix_v2
>> 
>> Open MPI versions tested: 4.0.0 and 3.1.2
>> 
>> 
>> For each installation to be mentioned a different MPI Hello World program 
>> was compiled.
>> Jobs were submitted by sbatch, 2 node * 2 tasks per node then srun 
>> --mpi=pmix program
>> 
>> File 400ext_2x2.out (attached) is for OMPI 4.0.0 installation with configure 
>> options:
>> --with-slurm --with-pmix=/usr --with-pmi=/usr --with-libevent=/usr
>> and configure log:
>> Libevent support: external
>> PMIx support: External (2x)
>> 
>> File 400int_2x2.out (attached) is for OMPI 4.0.0 installation with configure 
>> options:
>> --with-slurm --with-pmix
>> and configure log:
>> Libevent support: internal (external libevent version is less that internal 
>> version 2.0.22)
>> PMIx support: Internal
>> 
>> Tested also different installations for 3.1.2 and got errors similar to 
>> 400ext_2x2.out
>> (NOT-SUPPORTED in file event/pmix_event_registration.c at line 101)
>> 
>> 
>> 
>> 
>> 
>> All the best,
>> --
>> Passant A. Hafez | HPC Applications Specialist
>> KAUST Supercomputing Core Laboratory (KSL)
>> King Abdullah University of Science and Technology
>> Building 1, Al-Khawarizmi, Room 0123
>> Mobile : +966 (0) 55-247-9568
>> Mobile : +20 (0) 106-146-9644
>&g

Re: [OMPI users] Building PMIx and Slurm support

2019-03-03 Thread Daniel Letai

  
  
Gilles,

On 04/03/2019 01:59:28, Gilles Gouaillardet wrote:
> Daniel,
>
> keep in mind PMIx was designed with cross-version compatibility in mind,
> so a PMIx 3.0.2 client (read Open MPI 4.0.0 app with the internal 3.0.2 PMIx) should be able
> to interact with a PMIx 3.1.2 server (read SLURM pmix plugin built on top of PMIx 3.1.2).

Good to know - I did not find that information and was hesitant to mix and match.

> So unless you have a specific reason not to mix both, you might also give the internal PMIx a try.

Does this hold true for libevent too? Configure complains if libevent for openmpi is different than the one used for the other tools.

> The 4.0.1 release candidate 1 was released a few days ago, and based on the feedback we receive,
> the final 4.0.1 should be released in a very near future.

Thanks for the info.

> Cheers,
>
> Gilles

Cheers,
Dani_L

On 3/4/2019 1:08 AM, Daniel Letai wrote:
  
  

Sent from my iPhone


On 3 Mar 2019, at 16:31, Gilles
  Gouaillardet  wrote:
  
  
  Daniel,
  
  
  PMIX_MODEX and PMIX_INFO_ARRAY have been removed from PMIx
  3.1.2, and
  
  Open MPI 4.0.0 was not ready for this.
  
  
  You can either use the internal PMIx (3.0.2), or try 4.0.1rc1
  (with
  
  the external PMIx 3.1.2) that was published a few days ago.
  
  

Thanks, will try that tomorrow. I can’t use internal due to
Slurm dependency, but I will try the rc.

Any idea when 4.0.1 will be released?


FWIW, you are right using
  --with-pmix=external (and not using --with-pmix=/usr)
  
  
  Cheers,
  
  
  Gilles
  
  
  On Sun, Mar 3, 2019 at 10:57 PM Daniel
Letai  wrote:


Hello,



I have built the following stack :


centos 7.5 (gcc 4.8.5-28, libevent 2.0.21-4)

MLNX_OFED_LINUX-4.5-1.0.1.0-rhel7.5-x86_64.tgz built with
--all --without-32bit (this includes ucx 1.5.0)

hwloc from centos 7.5 : 1.11.8-4.el7

pmix 3.1.2

slurm 18.08.5-2 built --with-ucx --with-pmix

openmpi 4.0.0 : configure --with-slurm --with-pmix=external
--with-pmi --with-libevent=external --with-hwloc=external
--with-knem=/opt/knem-1.1.3.90mlnx1
--with-hcoll=/opt/mellanox/hcoll


The configure part succeeds, however 'make' errors out with:


ext3x.c: In function 'ext3x_value_unload':


ext3x.c:1109:10: error: 'PMIX_MODEX' undeclared (first use
in this function)



And same for 'PMIX_INFO_ARRAY'



However, both are declared in the
opal/mca/pmix/pmix3x/pmix/include/pmix_common.h file.


opal/mca/pmix/ext3x/ext3x.c does include pmix_common.h but
as a system include #include  , while
ext3x.h includes it as a local include #include
"pmix_common". Neither seem to pull from the correct path.



Regards,


Dani_L.



On 2/24/19 3:09 AM, Gilles Gouaillardet wrote:


Passant,


you have to manually download and apply

https://github.com/pmix/pmix/commit/2e2f4445b45eac5a3fcbd409c81efe318876e659.patch

to PMIx 2.2.1

that should likely fix your problem.


As a side note, it is a bad practice to configure
--with-FOO=/usr

since it might have some unexpected side effects.

Instead, you can replace


configure --with-slurm --with-pmix=/usr --with-pmi=/usr
--with-libevent=/usr


with


configure --with-slurm --with-pmix=external --with-pmi
--with-libevent=external

Re: [OMPI users] Building PMIx and Slurm support

2019-03-04 Thread Daniel Letai

  
  
Gilles,

On 3/4/19 8:28 AM, Gilles Gouaillardet wrote:
> Daniel,
>
> On 3/4/2019 3:18 PM, Daniel Letai wrote:
>>> So unless you have a specific reason not to mix both, you might also give the internal PMIx a try.
>>
>> Does this hold true for libevent too? Configure complains if libevent for openmpi is different than the one used for the other tools.
>
> I am not exactly sure of which scenario you are running.
>
> Long story short,
>  - If you use an external PMIx, then you have to use an external libevent (otherwise configure will fail).
>    It must be the same one used by PMIx, but I am not sure configure checks that.
>  - If you use the internal PMIx, then it is up to you. you can either use the internal libevent, or an external one.

Thanks, that clarifies the issues I've experienced. Since PMIx doesn't have to be the same for server and nodes, I can compile slurm with external PMIx with system libevent, and compile openmpi with internal PMIx and libevent, and that should work. Is that correct?

BTW, building 4.0.1rc1 completed successfully using external for all, will start testing in near future.

> Cheers,
>
> Gilles

Thanks,
Dani_L.

___
  
  users mailing list
  
  users@lists.open-mpi.org
  
  https://lists.open-mpi.org/mailman/listinfo/users

  

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Building PMIx and Slurm support

2019-03-12 Thread Daniel Letai

  
  
Hi,

On 12/03/2019 10:46:02, Passant A. Hafez wrote:


  Hi Gilles,

Yes it was just a typo in the last email, it was correctly spelled in the job script.

So I just tried to use 1 node * 2 tasks/node, I got the same error I posted before, just a copy for each process, here it is again:

*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***and potentially your MPI job)
[cn603-20-l:169109] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***and potentially your MPI job)
[cn603-20-l:169108] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
srun: error: cn603-20-l: tasks 0-1: Exited with exit code 1


I'm suspecting Slurm, but anyways, how can I troubleshoot this? 

Simple - try running directly without Slurm.
If it works - Slurm is the culprit. If not - it's MPI debug time.
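A minimal sketch of that check (./mpi_hello is a placeholder; the host names are taken from the log above):

mpirun -np 2 ./mpi_hello                                # one node, no Slurm/srun involved
mpirun -np 2 -H cn603-20-l,cn603-13-r ./mpi_hello       # two nodes via mpirun's own launcher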


  
The program is a simple MPI Hello World code.




All the best,
--
Passant A. Hafez | HPC Applications Specialist
KAUST Supercomputing Core Laboratory (KSL)
King Abdullah University of Science and Technology
Building 1, Al-Khawarizmi, Room 0123
Mobile : +966 (0) 55-247-9568
Mobile : +20 (0) 106-146-9644
Office  : +966 (0) 12-808-0367


From: users  on behalf of Gilles Gouaillardet 
Sent: Tuesday, March 12, 2019 8:22 AM
To: users@lists.open-mpi.org
Subject: Re: [OMPI users] Building PMIx and Slurm support

Passant,


Except the typo (it should be srun --mpi=pmix_v3), there is nothing
wrong with that, and it is working just fine for me

(same SLURM version, same PMIx version, same Open MPI version and same
Open MPI configure command line)

that is why I asked you some more information/logs in order to
investigate your issue.


You might want to try a single node job first in order to rule out
potential interconnect related issues.


Cheers,


Gilles


On 3/12/2019 1:54 PM, Passant A. Hafez wrote:

  
Hello Gilles,

Yes I do use srun --mpi=pmix_3 to run the app, what's the problem with
that?
Before that, when we tried to launch MPI apps directly with srun, we
got the error message saying Slurm missed the PMIx support, that's why
we proceeded with the installation.



All the best,

--

Passant

On Mar 12, 2019 6:53 AM, Gilles Gouaillardet  wrote:
Passant,


I built a similar environment, and had no issue running a simple MPI
program.


Can you please post your slurm script (I assume it uses srun to start
the MPI app),

the output of

scontrol show config | grep Mpi

and the full output of your job ?


Cheers,


Gilles


On 3/12/2019 7:59 AM, Passant A. Hafez wrote:


  
Hello,


So we now have Slurm 18.08.6-2 compiled with PMIx 3.1.2

then I installed openmpi 4.0.0 with:

--with-slurm  --with-pmix=internal --with-libevent=internal
--enable-shared --enable-
static  --with-x


(Following the thread, it was mentioned that building OMPI 4.0.0 with
PMIx 3.1.2 will fail with PMIX_MODEX and PMIX_INFO_ARRAY errors, so I
used internal PMIx)



The MPI program fails with:


*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***and potentially your MPI job)
[cn603-13-r:387088] Local abort before MPI_INIT completed completed
successfully, but am not able to aggregate error messages, and not
able to guarantee that all other processes were killed!


for each process, please advise! what's going wrong here?







All the best,
--
Passant A. Hafez | HPC Applications Specialist
KAUST Supercomputing Core Laboratory (KSL)
King Abdullah University of Science and Technology
Building 1, Al-Khawarizmi, Room 0123
Mobile : +966 (0) 55-247-9568
Mobile : +20 (0) 106-146-9644
Office : +966 (0) 12-808-0367

*From:* users  on behalf of Ralph H
Castain 
*Sent:* Monday, March 4, 2019 5:29 PM
*To:* Open MPI Users
*Subject:* Re: [OMPI users] Building PMIx and Slurm support



  
On Mar 4, 2019, at 5:34 AM, Daniel Letai <d...@letai.org.il
> wrote:

Gilles,
On 3/4/19 8:28 AM, Gilles Gouaillardet wrote:

    
  Daniel,


On 3/4/2019 3:18 PM, Daniel Letai wrote:

  



  So unless you have a specific reason not to mix both, you might
also give the internal PMIx a try.


Does this hold true for libevent too? Configure complains if
libevent for openmpi is different than t

[OMPI users] Are there any issues (performance or otherwise) building apps with different compiler from the one used to build openmpi?

2019-03-20 Thread Daniel Letai

  
  
Hello,

Assuming I have installed openmpi built with the distro stock gcc (4.4.7 on rhel 6.5), but an app requires a different gcc version (8.2, manually built on the dev machine).

Would there be any issues, or a performance penalty, if building the app using the more recent gcc with the flags from the wrapper compiler's --showme, as per https://www.open-mpi.org/faq/?category=mpi-apps#cant-use-wrappers ?

Openmpi is built with both pmix and ucx enabled, all built with the stock gcc (4.4.7).
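The pattern from that FAQ entry would look roughly like this (a sketch; gcc-8.2, app.c and app are placeholder names):

mpicc --showme:compile          # flags needed to compile against this Open MPI
mpicc --showme:link             # flags needed to link
gcc-8.2 $(mpicc --showme:compile) -c app.c
gcc-8.2 app.o $(mpicc --showme:link) -o app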



Since the constraint is the app, if the answer is yes I would have to build openmpi using the non-distro gcc, which is a bit of a hassle.


Thanks in advance
--Dani_L.

  

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] Packaging issue with linux spec file when not build_all_in_one_rpm due to empty grep

2019-04-16 Thread Daniel Letai

  
  
In the src rpm version 4.0.1, if building with --define 'build_all_in_one_rpm 0', the output of grep -v _mandir docs.files is empty.

The simple workaround is to follow the earlier pattern and pipe to /bin/true, as the spec doesn't really care if the file is empty. I'm wondering whether all the greps should be protected this way.

A simple patch:

diff --git a/contrib/dist/linux/openmpi.spec b/contrib/dist/linux/openmpi.spec
index 2a80af296b..2b897345f9 100644
--- a/contrib/dist/linux/openmpi.spec
+++ b/contrib/dist/linux/openmpi.spec
@@ -611,7 +611,7 @@ grep -v %{_includedir} devel.files > tmp.files
 mv tmp.files devel.files
 
 # docs sub package
-grep -v %{_mandir} docs.files > tmp.files
+grep -v %{_mandir} docs.files > tmp.files | /bin/true
 mv tmp.files docs.files
 
 %endif






  

___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

[OMPI users] TCP usage in MPI singletons

2019-04-17 Thread Daniel Hemberger
Hi everyone,

I've been trying to track down the source of TCP connections when running
MPI singletons, with the goal of avoiding all TCP communication to free up
ports for other processes. I have a local apt install of OpenMPI 2.1.1 on
Ubuntu 18.04 which does not establish any TCP connections by default,
either when run as "mpirun -np 1 ./program" or "./program". But it has
non-TCP alternatives for both the BTL (vader, self, etc.) and OOB (ud and
usock) frameworks, so I was not surprised by this result.

On a remote machine, I'm running the same test with an assortment of
OpenMPI versions (1.6.4, 1.8.6, 4.0.0, 4.0.1 on RHEL6 and 1.10.7 on RHEL7).
In all but 1.8.6 and 1.10.7, there is always a TCP connection established,
even if I disable the TCP BTL on the command line (e.g. "mpirun --mca btl
^tcp"). Therefore, I assumed this was because `tcp` was the only OOB
interface available in these installations. This TCP connection is
established both for "mpirun -np 1 ./program" and "./program".

The confusing part is that the 1.8.6 and 1.10.7 installations only appear
to establish a TCP connection when invoked with "mpirun -np 1 ./program",
but _not_ with "./program", even though its only OOB interface was also
`tcp`. This result was not consistent with my understanding, so now I am
confused about when I should expect TCP communication to occur.

Is there a known explanation for what I am seeing? Is there actually a way
to get singletons to forego all TCP communication, even if TCP is the only
OOB available, or is there something else at play here? I'd be happy to
provide any config.log files or ompi_info output if it would help.

For more context, the underlying issue I'm trying to resolve is that we are
(unfortunately) running many short instances of mpirun, and the TCP
connections are piling up in the TIME_WAIT state because they aren't
cleaned up faster than we create them.

Any advice or pointers would be greatly appreciated!

Thanks,
-Dan
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] TCP usage in MPI singletons

2019-04-19 Thread Daniel Hemberger
Hi Gilles, all,

Using `OMPI_MCA_ess_singleton_isolated=true ./program` achieves the desired
result of establishing no TCP connections for a singleton execution.

Thank you for the suggestion!

Best regards,
-Dan
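
For anyone who wants to reproduce the check: a minimal singleton test program
(a sketch only; the actual ./program used in this thread is not shown here)
just needs to initialize and finalize MPI, e.g.

/* Minimal singleton sketch (not the actual ./program from this thread).
 * Build with mpicc, run it directly as ./program (optionally with
 * OMPI_MCA_ess_singleton_isolated=true set), and inspect open sockets
 * with netstat or ss while it runs. */
#include <stdio.h>
#include <unistd.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int rank, size;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  /* A singleton should report size == 1. */
  printf("rank %d of %d\n", rank, size);
  sleep(30);   /* keep the process alive long enough to look at its sockets */

  MPI_Finalize();
  return 0;
}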

On Wed, Apr 17, 2019 at 5:35 PM Gilles Gouaillardet 
wrote:

> Daniel,
>
>
> If your MPI singleton will never call MPI_Comm_spawn(), then you can use the
> isolated mode, like this:
>
> OMPI_MCA_ess_singleton_isolated=true ./program
>
>
> You can also save some ports by blacklisting the btl/tcp component
>
>
> OMPI_MCA_ess_singleton_isolated=true OMPI_MCA_pml=ob1
> OMPI_MCA_btl=vader,self ./program
>
>
> Cheers,
>
>
> Gilles
>
> On 4/18/2019 3:51 AM, Daniel Hemberger wrote:
> > Hi everyone,
> >
> > I've been trying to track down the source of TCP connections when
> > running MPI singletons, with the goal of avoiding all TCP
> > communication to free up ports for other processes. I have a local apt
> > install of OpenMPI 2.1.1 on Ubuntu 18.04 which does not establish any
> > TCP connections by default, either when run as "mpirun -np 1
> > ./program" or "./program". But it has non-TCP alternatives for both
> > the BTL (vader, self, etc.) and OOB (ud and usock) frameworks, so I
> > was not surprised by this result.
> >
> > On a remote machine, I'm running the same test with an assortment of
> > OpenMPI versions (1.6.4, 1.8.6, 4.0.0, 4.0.1 on RHEL6 and 1.10.7 on
> > RHEL7). In all but 1.8.6 and 1.10.7, there is always a TCP connection
> > established, even if I disable the TCP BTL on the command line (e.g.
> > "mpirun --mca btl ^tcp"). Therefore, I assumed this was because `tcp`
> > was the only OOB interface available in these installations. This TCP
> > connection is established both for "mpirun -np 1 ./program" and
> > "./program".
> >
> > The confusing part is that the 1.8.6 and 1.10.7 installations only
> > appear to establish a TCP connection when invoked with "mpirun -np 1
> > ./program", but _not_ with "./program", even though its only OOB
> > interface was also `tcp`. This result was not consistent with my
> > understanding, so now I am confused about when I should expect TCP
> > communication to occur.
> >
> > Is there a known explanation for what I am seeing? Is there actually a
> > way to get singletons to forego all TCP communication, even if TCP is
> > the only OOB available, or is there something else at play here? I'd
> > be happy to provide any config.log files or ompi_info output if it
> > would help.
> >
> > For more context, the underlying issue I'm trying to resolve is that
> > we are (unfortunately) running many short instances of mpirun, and the
> > TCP connections are piling up in the TIME_WAIT state because they
> > aren't cleaned up faster than we create them.
> >
> > Any advice or pointers would be greatly appreciated!
> >
> > Thanks,
> > -Dan
> >
> > ___
> > users mailing list
> > users@lists.open-mpi.org
> > https://lists.open-mpi.org/mailman/listinfo/users
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users
>
___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] Beowulf cluster and openmpi

2008-11-05 Thread Daniel Gruner
Can your nodes see the openmpi libraries and executables?  I have the
/usr/local and /opt from the master node mounted on the compute nodes,
in addition to having the LD_LIBRARY_PATH defined correctly.  In your
case the nodes must be able to see /home/rchaud/openmpi-1.2.6 in order
to get the libraries and executables, so this directory must be mounted
on the nodes.  You don't want to copy all this stuff to the nodes in a
bproc environment, since it would eat away at your ram.

Daniel

On Wed, Nov 05, 2008 at 12:44:03PM -0600, Rima Chaudhuri wrote:
> Thanks for all your help Ralph and Sean!!
> I changed the machinefile to just containing the node numbers. I added
> the env variable NODES in my .bash_profile and .bashrc.
> As per Sean's suggestion I added the $LD_LIBRARY_PATH (shared lib path
> which the openmpi lib directory path) and the $AMBERHOME/lib  as 2 of
> the libraries' path in the config file of beowulf. I also checked by
> bpsh from one of the compute nodes whether it can see the executables
> which is in $AMBERHOME/exe and the mpirun(OMPI):
> I get the following error message:
> 
> [rchaud@helios amber10]$ ./step1
> --
> A daemon (pid 25319) launched by the bproc PLS component on node 2 died
> unexpectedly on signal 13 so we are aborting.
> 
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 717
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> pls_bproc.c at line 1164
> [helios.structure.uic.edu:25317] [0,0,0] ORTE_ERROR_LOG: Error in file
> rmgr_urm.c at line 462
> [helios.structure.uic.edu:25317] mpirun: spawn failed with errno=-1
> 
> 
> I tested to see if the compute nodes could see the master by the
> following commands:
> 
> [rchaud@helios amber10]$ bpsh 2 echo $LD_LIBRARY_PATH
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib
> [rchaud@helios amber10]$ bpsh 2 echo $AMBERHOME
> /home/rchaud/Amber10_openmpi/amber10
> [rchaud@helios amber10]$ bpsh 2 ls -al
> total 11064
> drwxr-xr-x   11 rchaud   04096 Nov  5 11:33 .
> drwxr-xr-x3 rchaud   100  4096 Oct 20 17:21 ..
> -rw-r--r--1 128  53   1201 Jul 10 17:08 Changelog_at
> -rw-rw-r--1 128  53  25975 Feb 28  2008
> GNU_Lesser_Public_License
> -rw-rw1 128  53   3232 Mar 30  2008 INSTALL
> -rw-rw-r--1 128  53  20072 Feb 11  2008 LICENSE_at
> -rw-r--r--1 00 1814241 Oct 31 13:32 PLP_617_xtal_nolig.crd
> -rw-r--r--1 00 8722770 Oct 31 13:31 PLP_617_xtal_nolig.top
> -rw-rw-r--1 128  53   1104 Mar 18  2008 README
> -rw-r--r--1 128  53   1783 Jun 23 19:43 README_at
> drwxrwxr-x   10 128  53   4096 Oct 20 17:23 benchmarks
> drwxr-xr-x2 004096 Oct 20 18:21 bin
> -rw-r--r--1 00  642491 Oct 20 17:51 bugfix.all
> drwxr-xr-x   13 004096 Oct 20 17:37 dat
> drwxr-xr-x3 004096 Oct 20 17:23 doc
> drwxrwxr-x9 128  53   4096 Oct 20 17:23 examples
> lrwxrwxrwx1 00   3 Oct 20 17:34 exe -> bin
> drwxr-xr-x2 004096 Oct 20 17:35 include
> drwxr-xr-x2 004096 Oct 20 17:36 lib
> -rw-r--r--1 rchaud   10030 Nov  5 11:33 machinefile
> -rw-r--r--1 rchaud   100   161 Nov  5 12:11 min
> drwxrwxr-x   40 128  53   4096 Oct 20 17:50 src
> -rwxr-xr-x1 rchaud   100   376 Nov  3 16:41 step1
> drwxrwxr-x  114 128  53   4096 Oct 20 17:23 test
> 
> [rchaud@helios amber10]$ bpsh 2 which mpirun
> /home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/bin/mpirun
> 
> The $LD_LIBRARY_PATH seems to be defined correctly, but then why is it
> not being read?
> 
> thanks
> 
> On Wed, Nov 5, 2008 at 11:08 AM,   wrote:
> > Send users mailing list submissions to
> >us...@open-mpi.org
> >
> > To subscribe or unsubscribe via the World Wide Web, visit
> >http://www.open-mpi.org/mailman/listinfo.cgi/users
> > or, via email, send a message with subject or body 'help' to
> >users-requ...@open-mpi.org
> >
> > You can reach the person managing the list at
> >users-ow...@open-mpi.org
&

[OMPI users] problem with overlapping communication with calculation

2009-03-25 Thread Daniel Spångberg

Dear list,

We've found a problem with Open MPI when running over IB: when a calculation  
reads elements of an array while communication involving other elements of the  
same array (elements not used in the calculation) is still in progress, we get  
wrong results. I have written a small test program (below) that shows this  
behaviour. When the array is small (arrlen in the code), more problems occur.  
The problems occur only when using IB (even on the same node!?); using mpirun  
-mca btl tcp,self the problem vanishes.


The behaviour of 1.2.9 and 1.3.1 differs slightly: problems occur already with  
3 processes under Open MPI 1.2.9, whereas 4 processes are required to trigger  
them with 1.3.1. The proper output on 4 processes should just  
be:

Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60

With IB:
mpirun  -np 4 ./test3|head
Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60
Result on rank 0 strangely is 1.06316e+248
Result on rank 2 strangely is 1.54396e+262
Result on rank 3 strangely is 3.87325e+233
Result on rank 1 strangely is 1.54396e+262
Result on rank 1 strangely is 1.54396e+262
Result on rank 2 strangely is 1.54396e+262


Info about the system:

openmpi: 1.2.9, 1.3.1

From ompi_info:
   MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.1)

From lspci:
04:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)

configure picks up ibverbs:
--- MCA component btl:ofud (m4 configuration macro)
checking for MCA component btl:ofud compile mode... dso
checking --with-openib value... simple ok (unspecified)
checking --with-openib-libdir value... simple ok (unspecified)
checking for fcntl.h... (cached) yes
checking sys/poll.h usability... yes
checking sys/poll.h presence... yes
checking for sys/poll.h... yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
looking for library without search path
checking for ibv_open_device in -libverbs... yes
checking number of arguments to ibv_create_cq... 5
checking whether IBV_EVENT_CLIENT_REREGISTER is declared... yes
checking for ibv_get_device_list... yes
checking for ibv_resize_cq... yes
checking for struct ibv_device.transport_type... yes
checking for ibv_create_xrc_rcv_qp... no
checking rdma/rdma_cma.h usability... yes
checking rdma/rdma_cma.h presence... yes
checking for rdma/rdma_cma.h... yes
checking for rdma_create_id in -lrdmacm... yes
checking for rdma_get_peer_addr... yes
checking for infiniband/driver.h... yes
checking if ConnectX XRC support is enabled... no
checking if OpenFabrics RDMACM support is enabled... yes
checking if OpenFabrics IBCM support is enabled... no
checking if MCA component btl:ofud can compile... yes

--- MCA component btl:openib (m4 configuration macro)
checking for MCA component btl:openib compile mode... dso
checking --with-openib value... simple ok (unspecified)
checking --with-openib-libdir value... simple ok (unspecified)
checking for fcntl.h... (cached) yes
checking for sys/poll.h... (cached) yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
looking for library without search path
checking for ibv_open_device in -libverbs... yes
checking number of arguments to ibv_create_cq... (cached) 5
checking whether IBV_EVENT_CLIENT_REREGISTER is declared... (cached) yes
checking for ibv_get_device_list... (cached) yes
checking for ibv_resize_cq... (cached) yes
checking for struct ibv_device.transport_type... (cached) yes
checking for ibv_create_xrc_rcv_qp... (cached) no
checking for rdma/rdma_cma.h... (cached) yes
checking for rdma_create_id in -lrdmacm... (cached) yes
checking for rdma_get_peer_addr... yes
checking for infiniband/driver.h... (cached) yes
checking if ConnectX XRC support is enabled... no
checking if OpenFabrics RDMACM support is enabled... yes
checking if OpenFabrics IBCM support is enabled... no
checking for ibv_fork_init... yes
checking for thread support (needed for ibcm/rdmacm)... posix
checking which openib btl cpcs will be built... oob rdmacm
checking if MCA component btl:openib can compile... yes


Compilers: gcc 4.1.2 and pgcc 8.0-4 show the same problems; the optimization  
level does not matter (-fast, -O3 or -O0) (64 bit)


CPU: opteron 250
OS: Scientific linux 5.2

If you require any more information, I'll be more than happy to provide it!

Is this a proper way to overlap communication with calculation? Could this  
be some kind of cache-coherency problem (values already in the CPU cache while  
RDMA puts the new data directly in memory)? Although in that case I would not  
expect the sum to be that far off. What would happen if the compiler decided  
to do non-temporal prefetches (or stores, in the general case)?
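
For reference, a legal way to overlap (start the persistent requests, compute
only on data that is not part of any in-flight buffer, then wait) can be
sketched as follows; this is only an illustration of the pattern, not my test
program, which follows below.

/* Sketch of the overlap pattern only (illustration, not the test program
 * below): start the persistent requests, do work that does not touch any
 * buffer involved in the communication, then wait for completion. */
#include <mpi.h>

double overlapped_step(double *comm_array, MPI_Request *reqarr, int nreq,
                       const double *local_work, int nlocal)
{
  int i;
  double sum = 0.0;

  MPI_Startall(nreq, reqarr);           /* kick off persistent sends/recvs */

  for (i = 0; i < nlocal; i++)          /* computation on data that is not */
    sum += local_work[i];               /* part of any in-flight buffer    */

  MPI_Waitall(nreq, reqarr, MPI_STATUSES_IGNORE);

  /* Only now is it safe to read the elements of comm_array that were being
     received (and to reuse the elements that were being sent). */
  (void)comm_array;
  return sum;
}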




The code:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>


int main(int argc, char **argv)
{
  int rank,size,i,j,k;
  const int arrlen=10;
  const int repeattest=100;
  double *array;
  MPI_Request *reqarr;
  MPI_Status *mpistat;
  MPI_Datatype STRIDED;
  int torank,fromrank,nre

Re: [OMPI users] problem with overlapping communication with calculation

2009-03-25 Thread Daniel Spångberg

Dear list,

A colleague pointed out an error in my test code. The final loop should  
not be

 for (i=0; i

details, details... Anyway, I still get problems from time to time with  
this test code, but I have not yet had time to figure out the  
circumstances when this happens. I will report back to this list once I  
know what's going on.


Sorry to trouble you too early!

Daniel Spångberg


Den 2009-03-25 09:44:37 skrev Daniel Spångberg :


Dear list,

We've found a problem with openmpi when running over IB when calculation  
reading elements of an array is overlapping communication to other  
elements (that are not used in the calculation) of the same array. I  
have written a small test program (below) that shows this behaviour.  
When the array is small (arrlen in the code), more problems occur. The  
problems only occur when using IB (even on the same node!?), using  
mpirun -mca btl tcp,self the problem vanishes.


The behaviour with 1.2.9 and 1.3.1 is slightly different, where problems  
occur already for 3 processes with openmpi 1.2.9 but 4 processes are  
required for problems with 1.3.1. Proper output on 4 processes should  
just be:

Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60

With IB:
mpirun  -np 4 ./test3|head
Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60
Result on rank 0 strangely is 1.06316e+248
Result on rank 2 strangely is 1.54396e+262
Result on rank 3 strangely is 3.87325e+233
Result on rank 1 strangely is 1.54396e+262
Result on rank 1 strangely is 1.54396e+262
Result on rank 2 strangely is 1.54396e+262


Info about the system:

openmpi: 1.2.9, 1.3.1

 From ompi_info:
MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.1)

 From lspci:
04:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)

configure picks up ibverbs:
--- MCA component btl:ofud (m4 configuration macro)
checking for MCA component btl:ofud compile mode... dso
checking --with-openib value... simple ok (unspecified)
checking --with-openib-libdir value... simple ok (unspecified)
checking for fcntl.h... (cached) yes
checking sys/poll.h usability... yes
checking sys/poll.h presence... yes
checking for sys/poll.h... yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
looking for library without search path
checking for ibv_open_device in -libverbs... yes
checking number of arguments to ibv_create_cq... 5
checking whether IBV_EVENT_CLIENT_REREGISTER is declared... yes
checking for ibv_get_device_list... yes
checking for ibv_resize_cq... yes
checking for struct ibv_device.transport_type... yes
checking for ibv_create_xrc_rcv_qp... no
checking rdma/rdma_cma.h usability... yes
checking rdma/rdma_cma.h presence... yes
checking for rdma/rdma_cma.h... yes
checking for rdma_create_id in -lrdmacm... yes
checking for rdma_get_peer_addr... yes
checking for infiniband/driver.h... yes
checking if ConnectX XRC support is enabled... no
checking if OpenFabrics RDMACM support is enabled... yes
checking if OpenFabrics IBCM support is enabled... no
checking if MCA component btl:ofud can compile... yes

--- MCA component btl:openib (m4 configuration macro)
checking for MCA component btl:openib compile mode... dso
checking --with-openib value... simple ok (unspecified)
checking --with-openib-libdir value... simple ok (unspecified)
checking for fcntl.h... (cached) yes
checking for sys/poll.h... (cached) yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
looking for library without search path
checking for ibv_open_device in -libverbs... yes
checking number of arguments to ibv_create_cq... (cached) 5
checking whether IBV_EVENT_CLIENT_REREGISTER is declared... (cached) yes
checking for ibv_get_device_list... (cached) yes
checking for ibv_resize_cq... (cached) yes
checking for struct ibv_device.transport_type... (cached) yes
checking for ibv_create_xrc_rcv_qp... (cached) no
checking for rdma/rdma_cma.h... (cached) yes
checking for rdma_create_id in -lrdmacm... (cached) yes
checking for rdma_get_peer_addr... yes
checking for infiniband/driver.h... (cached) yes
checking if ConnectX XRC support is enabled... no
checking if OpenFabrics RDMACM support is enabled... yes
checking if OpenFabrics IBCM support is enabled... no
checking for ibv_fork_init... yes
checking for thread support (needed for ibcm/rdmacm)... posix
checking which openib btl cpcs will be built... oob rdmacm
checking if MCA component btl:openib can compile... yes


Compilers: gcc 4.1.2 and pgcc 8.0-4 same problems, optimization level  
does not matter. (-fast, -O3 or -O0) (64 bit)


CPU: opteron 250
OS: Scientific linux 5.2

If you require any more information, I'll be more than happy to provide  
it!


Is this a proper way to overlap communication with calculation? Could  
this be some kind of cache-coherency problem? values in cpu cache  
already but r

Re: [OMPI users] problem with overlapping communication with calculation

2009-03-25 Thread Daniel Spångberg

Dear list,

The bad behaviour now only occurs with version 1.2.X of Open MPI (I have  
tried 1.2.5, 1.2.8 and 1.2.9 with gcc, and 1.2.7 and 1.2.9 with pgi cc;  
the problem is present in all of those). With 1.3.1 I can find no problem  
at all. So perhaps that means the problem is solved?


mpirun -np 4 ./test4|head
Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60
Result on rank 1 strangely is 50
Result on rank 1 strangely is 30
Result on rank 3 strangely is 90
Result on rank 3 strangely is 80
Result on rank 0 strangely is 50
Result on rank 1 strangely is 40

Without IB there is no problem:
mpirun -mca btl self,tcp -np 4 ./test4
Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60

The full (bug fixed code):

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>


int main(int argc, char **argv)
{
  int rank,size,i,j,k;
  const int arrlen=10;
  const int repeattest=100;
  double *array;
  MPI_Request *reqarr;
  MPI_Status *mpistat;
  MPI_Datatype STRIDED;
  int torank,fromrank,nreq;
  int sumshouldbe;
  MPI_Init(&argc,&argv);
  MPI_Comm_rank(MPI_COMM_WORLD,&rank);
  MPI_Comm_size(MPI_COMM_WORLD,&size);

  /* Non-contiguous data */
  MPI_Type_vector(arrlen,1,size,MPI_DOUBLE,&STRIDED);
  MPI_Type_commit(&STRIDED);

  array=malloc(arrlen*size *sizeof *array);
  reqarr=malloc(2*size*sizeof *reqarr);
  mpistat=malloc(2*size*sizeof *mpistat);

  /* Setup communication */
  sumshouldbe=0;
  nreq=0;
  for (i=1; i<size; i++)
    {
      torank=rank+i;
      if (torank>=size)
        torank-=size;
      fromrank=rank-i;
      if (fromrank<0)
        fromrank+=size;
      MPI_Recv_init(array+i,1,STRIDED,fromrank,i,MPI_COMM_WORLD,reqarr+nreq);
      nreq++;
      MPI_Send_init(array,1,STRIDED,torank,i,MPI_COMM_WORLD,reqarr+nreq);
      nreq++;
      sumshouldbe+=i;
    }
  printf("Sum should be %g\n",(double)arrlen*sumshouldbe);
  /* Do the tests. */
  for (j=0; j<repeattest; j++)
    ...

Den 2009-03-25 skrev Daniel Spångberg :


Dear list,

A colleague pointed out an error in my test code. The final loop should  
not be

  for (i=0; i

details, details... Anyway, I still get problems from time to time with  
this test code, but I have not yet had time to figure out the  
circumstances when this happens. I will report back to this list once I  
know what's going on.


Sorry to trouble you too early!

Daniel Spångberg


Den 2009-03-25 09:44:37 skrev Daniel Spångberg :


Dear list,

We've found a problem with openmpi when running over IB when  
calculation reading elements of an array is overlapping communication  
to other elements (that are not used in the calculation) of the same  
array. I have written a small test program (below) that shows this  
behaviour. When the array is small (arrlen in the code), more problems  
occur. The problems only occur when using IB (even on the same node!?),  
using mpirun -mca btl tcp,self the problem vanishes.


The behaviour with 1.2.9 and 1.3.1 is slightly different, where  
problems occur already for 3 processes with openmpi 1.2.9 but 4  
processes are required for problems with 1.3.1. Proper output on 4  
processes should just be:

Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60

With IB:
mpirun  -np 4 ./test3|head
Sum should be 60
Sum should be 60
Sum should be 60
Sum should be 60
Result on rank 0 strangely is 1.06316e+248
Result on rank 2 strangely is 1.54396e+262
Result on rank 3 strangely is 3.87325e+233
Result on rank 1 strangely is 1.54396e+262
Result on rank 1 strangely is 1.54396e+262
Result on rank 2 strangely is 1.54396e+262


Info about the system:

openmpi: 1.2.9, 1.3.1

 From ompi_info:
MCA btl: openib (MCA v2.0, API v2.0, Component v1.3.1)

 From lspci:
04:00.0 InfiniBand: Mellanox Technologies MT23108 InfiniHost (rev a1)

configure picks up ibverbs:
--- MCA component btl:ofud (m4 configuration macro)
checking for MCA component btl:ofud compile mode... dso
checking --with-openib value... simple ok (unspecified)
checking --with-openib-libdir value... simple ok (unspecified)
checking for fcntl.h... (cached) yes
checking sys/poll.h usability... yes
checking sys/poll.h presence... yes
checking for sys/poll.h... yes
checking infiniband/verbs.h usability... yes
checking infiniband/verbs.h presence... yes
checking for infiniband/verbs.h... yes
looking for library without search path
checking for ibv_open_device in -libverbs... yes
checking number of arguments to ibv_create_cq... 5
checking whether IBV_EVENT_CLIENT_REREGISTER is declared... yes
checking for ibv_get_device_list... yes
checking for ibv_resize_cq... yes
checking for struct ibv_device.transport_type... yes
checking for ibv_create_xrc_rcv_qp... no
checking rdma/rdma_cma.h usability... yes
checking rdma/rdma_cma.h presence... yes
checking for rdma/rdma_cma.h... yes
checking for rdma_create_id in -lrdmacm... yes
checking for rdma_get_peer_addr... yes
checking for infiniband/driver.h... yes
checking if ConnectX XRC support is enabled... no
checking if OpenFabrics RDMACM support is enabled... yes
checking if OpenFabrics IBCM support is enabled... no
checki

Re: [OMPI users] Open-MPI and gprof

2009-04-23 Thread Daniel Spångberg

I have used vprof, which is free, and also works well with openmpi:
http://sourceforge.net/projects/vprof/

One might need slight code modifications to get output, depending on  
compilers used, such as adding

vmon_begin();
to start profiling and
vmon_done_task(rank);
to end profiling where rank is the MPI rank integer.

vprof can also use papi, but I have not (yet) tried this.
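
A minimal sketch of where those calls would go in an MPI code (assuming the
vmon_begin/vmon_done_task interface behaves as described above; the
declarations below stand in for the real vprof header):

/* Sketch only: assumes vprof provides vmon_begin() and vmon_done_task(int)
 * as described above; a real build would include the vprof header and link
 * against its monitor library. */
#include <mpi.h>

void vmon_begin(void);          /* assumed prototypes, see note above */
void vmon_done_task(int task);

int main(int argc, char **argv)
{
  int rank;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  vmon_begin();                 /* start profiling */

  /* ... the actual application work goes here ... */

  vmon_done_task(rank);         /* write per-rank profile data */

  MPI_Finalize();
  return 0;
}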

Daniel Spångberg


Den 2009-04-23 02:00:01 skrev Brock Palen :

There is a tool (not free)  That I have liked that works great with  
OMPI, and can use gprof information.


http://www.allinea.com/index.php?page=74

Also I am not sure but Tau (which is free)  Might support some gprof  
hooks.

http://www.cs.uoregon.edu/research/tau/home.php

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Apr 22, 2009, at 7:37 PM, jgans wrote:


Hi,

Yes you can profile MPI applications by compiling with -pg. However, by  
default each process will produce an output file called "gmon.out",  
which is a problem if all processes are writing to the same global file  
system (i.e. all processes will try to write to the same file).


There is an undocumented feature of gprof that allows you to specify  
the filename for profiling output via the environment variable  
GMON_OUT_PREFIX. For example, one can set this variable in the .bashrc  
file for every node to ensure unique profile filenames, i.e.:


export GMON_OUT_PREFIX='gmon.out-'`/bin/uname -n`

The filename will appear as GMON_OUT_PREFIX.pid, where pid is the  
process id on a given node (so this will work even when multiple  
processes run on a single host).
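
An alternative, assuming glibc's profiling runtime reads GMON_OUT_PREFIX at
exit time (when gmon.out is actually written), is to set a rank-specific
prefix from inside the program after MPI_Init; a sketch:

/* Sketch: per-rank profile filenames set from inside the program.
 * Assumes the gprof runtime (glibc) reads GMON_OUT_PREFIX when the
 * profile is written at exit.  Compile everything with -pg as usual. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
  int rank;
  char prefix[64];

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  snprintf(prefix, sizeof(prefix), "gmon.out-rank%d", rank);
  setenv("GMON_OUT_PREFIX", prefix, 1);  /* output becomes gmon.out-rankN.<pid> */

  /* ... application work ... */

  MPI_Finalize();
  return 0;
}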


Regards,

Jason

Tiago Almeida wrote:

Hi,
I've never done this, but I believe that an executable compiled with  
profilling support (-pg) will generate the gmon.out file in its  
current directory, regardless of running under MPI or not. So I think  
that you'll have a gmon.out on each node and therefore you can "gprof"  
them independently.


Best regards,
Tiago Almeida
-
jody wrote:

Hi
I wanted to profile my application using gprof, and proceeded like
when profiling a normal application:
- compile everything with option -pg
- run application
- call gprof
This returns a normal-looking output, but i don't know
whether this is the data for node 0 only or accumulated for all nodes.

Does anybody have experience in profiling parallel applications?
Is there a way to have profile data for each node separately?
If not, is there another profiling tool which can?

Thank You
  Jody
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Daniel Spångberg
Materialkemi
Uppsala Universitet


Re: [OMPI users] Open-MPI and gprof

2009-04-23 Thread Daniel Spångberg
Regarding the miscompilation of vprof and bfd_get_section_size_before_reloc:  
simply change the call from bfd_get_section_size_before_reloc to  
bfd_get_section_size in exec.cc and recompile.


Daniel Spångberg

Den 2009-04-23 10:16:07 skrev jody :


Hi all
Thanks for all the input.

I have not gotten around to try any of the tools (Sun Studio, Tau or  
vprof).

Actually, i can't compile vprof - make fails with
  exec.cc: In static member function ‘static void
BFDExecutable::find_address_in_section(bfd*, asection*, void*)’:
  exec.cc:144: error: ‘bfd_get_section_size_before_reloc’ was not
declared in this scope
Does anybody have an idea how to get around this problem?

Anyway, the GMON_OUT_PREFIX hint was very helpful - thanks, Jason!

If i  get vprof or one of the other tools running, i'll write something  
up -

perhaps the profiling subject would be worthy for a FAQ entry...

Thanks
  Jody

On Thu, Apr 23, 2009 at 9:12 AM, Daniel Spångberg   
wrote:

I have used vprof, which is free, and also works well with openmpi:
http://sourceforge.net/projects/vprof/

One might need slight code modifications to get output, depending on
compilers used, such as adding
vmon_begin();
to start profiling and
vmon_done_task(rank);
to end profiling where rank is the MPI rank integer.

vprof can also use papi, but I have not (yet) tried this.

Daniel Spångberg


Den 2009-04-23 02:00:01 skrev Brock Palen :

There is a tool (not free)  That I have liked that works great with  
OMPI,

and can use gprof information.

http://www.allinea.com/index.php?page=74

Also I am not sure but Tau (which is free)  Might support some gprof
hooks.
http://www.cs.uoregon.edu/research/tau/home.php

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Apr 22, 2009, at 7:37 PM, jgans wrote:


Hi,

Yes you can profile MPI applications by compiling with -pg. However,  
by
default each process will produce an output file called "gmon.out",  
which is
a problem if all processes are writing to the same global file system  
(i.e.

all processes will try to write to the same file).

There is an undocumented feature of gprof that allows you to specify  
the
filename for profiling output via the environment variable  
GMON_OUT_PREFIX.
For example, one can set this variable in the .bashrc file for every  
node to

insure unique profile filenames, i.e.:

export GMON_OUT_PREFIX='gmon.out-'`/bin/uname -n`

The filename will appear as GMON_OUT_PREFIX.pid, where pid is the  
process
id on a given node (so this will work when multiple nodes are  
contained in a

single host).

Regards,

Jason

Tiago Almeida wrote:


Hi,
I've never done this, but I believe that an executable compiled with
profilling support (-pg) will generate the gmon.out file in its  
current
directory, regardless of running under MPI or not. So I think that  
you'll

have a gmon.out on each node and therefore you can "gprof" them
independently.

Best regards,
Tiago Almeida
-
jody wrote:


Hi
I wanted to profile my application using gprof, and proceeded like
when profiling a normal application:
- compile everything with option -pg
- run application
- call gprof
This returns a normal-looking output, but i don't know
whether this is the data for node 0 only or accumulated for all  
nodes.


Does anybody have experience in profiling parallel applications?
Is there a way to have profile data for each node separately?
If not, is there another profiling tool which can?

Thank You
 Jody
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Daniel Spångberg
Materialkemi
Uppsala Universitet
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




--
Daniel Spångberg
Materialkemi
Uppsala Universitet


Re: [OMPI users] Building OMPI-1.0.2 on OS X v10.3.9 with IBM XLC +XLF

2006-04-10 Thread David Daniel

Perhaps this is a bug in xlc++.  Maybe this one...

http://www-1.ibm.com/support/docview.wss?uid=swg1IY78555

My (untested) guess is that removing the const_cast will allow it to  
compile, i.e. in ompi/mpi/cxx/group_inln.h replace

const_cast<int (*)[3]>(ranges)
by
ranges

David


On Apr 10, 2006, at 12:17 PM, Warner Yuen wrote:

I'm running Mac OS X v 10.3.9 Panther and tried to get OpenMPI to  
compile with IBM XLC and XLF. The compilation failed, any ideas  
what might be going wrong? I used the following settings:


export CC=/opt/ibmcmp/vacpp/6.0/bin/xlc
export CXX=/opt/ibmcmp/vacpp/6.0/bin/xlc++
export CFLAGS="-O3"
export CXXFLAGS="-O3"
export FFLAGS="-O3"
./configure --with-gm=/opt/gm --prefix=/home/warner/mpi_src/ompi102

ranlib .libs/libmpi_c_mpi.a
creating libmpi_c_mpi.la
(cd .libs && rm -f libmpi_c_mpi.la && ln -s ../libmpi_c_mpi.la  
libmpi_c_mpi.la)

Making all in cxx
source='mpicxx.cc' object='mpicxx.lo' libtool=yes \
DEPDIR=.deps depmode=none /bin/sh ../../.././config/depcomp \
/bin/sh ../../../libtool --tag=CXX --mode=compile /opt/ibmcmp/vacpp/ 
6.0/bin/xlc++ -DHAVE_CONFIG_H -I. -I. -I../../../include -I../../../ 
include   -I../../../include -I../../.. -I../../.. -I../../../ 
include -I../../../opal -I../../../orte -I../../../ompi  - 
D_REENTRANT  -DNDEBUG -O3  -c -o mpicxx.lo mpicxx.cc

mkdir .libs
/opt/ibmcmp/vacpp/6.0/bin/xlc++ -DHAVE_CONFIG_H -I. -I. -I../../../ 
include -I../../../include -I../../../include -I../../.. -I../../..  
-I../../../include -I../../../opal -I../../../orte -I../../../ompi - 
D_REENTRANT -DNDEBUG -O3 -c mpicxx.cc  -qnocommon -DPIC -o .libs/ 
mpicxx.o
"../../../ompi/mpi/cxx/group_inln.h", line 100.66: 1540-0216 (S) An  
expression of type "const int [][3]" cannot be converted to type  
"int (*)[3]".
"../../../ompi/mpi/cxx/group_inln.h", line 108.66: 1540-0216 (S) An  
expression of type "const int [][3]" cannot be converted to type  
"int (*)[3]".

make[3]: *** [mpicxx.lo] Error 1
make[2]: *** [all-recursive] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1


-Thanks and have an OpenMPI day!

Warner Yuen
Apple Computer
email: wy...@apple.com
Tel: 408.718.2859
Fax: 408.715.0133


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Building 32-bit OpenMPI package for 64-bit Opteron platform

2006-04-11 Thread David Daniel
I suspect that to get this to work for bproc we will have to  
build mpirun as 64-bit and the library as 32-bit.  That's because a  
32-bit compiled mpirun calls functions in the 32-bit  
/usr/lib/libbproc.so which don't appear to function when the system  
is booted 64-bit.


Of course that would mean we need heterogeneous support to run on a  
single homogeneous system!  Will this work on the 1.0 branch?


An alternative worth thinking about is to bypass the library calls  
and start processes using a system() call to invoke the bpsh  
command.  bpsh is a 64-bit executable linked with  
/usr/lib64/libbproc.so, and it successfully launches both 32- and  
64-bit executables.
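
A sketch of that bypass, with a hypothetical node number and program path,
just to illustrate the idea:

/* Sketch of the bypass: launch a remote process by shelling out to bpsh
 * instead of calling libbproc directly.  The node number and program path
 * below are hypothetical placeholders. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
  const char *cmd = "bpsh 2 /home/user/a.out";   /* hypothetical */
  int rc = system(cmd);

  if (rc != 0)
    fprintf(stderr, "launch via '%s' returned %d\n", cmd, rc);
  return rc == 0 ? 0 : 1;
}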


I'm currently trying to solve the same issue for LA-MPI :(

David


On Apr 10, 2006, at 9:18 AM, Brian Barrett wrote:


On Apr 10, 2006, at 11:07 AM, David Gunter wrote:


(flashc 105%) mpiexec -n 4 ./send4
[flashc.lanl.gov:09921] mca: base: component_find: unable to open: /
lib/libc.so.6: version `GLIBC_2.3.4' not found (required by /net/
scratch1/dog/flash64/openmpi/openmpi-1.0.2-32b/lib/openmpi/
mca_paffinity_linux.so) (ignored)
[flashc.lanl.gov:09921] mca: base: component_find: unable to open:
libbproc.so.4: cannot open shared object file: No such file or
directory (ignored)
[flashc.lanl.gov:09921] mca: base: component_find: unable to open:
libbproc.so.4: cannot open shared object file: No such file or
directory (ignored)
[flashc.lanl.gov:09921] mca: base: component_find: unable to open:
libbproc.so.4: cannot open shared object file: No such file or
directory (ignored)
[flashc.lanl.gov:09921] mca: base: component_find: unable to open:
libbproc.so.4: cannot open shared object file: No such file or
directory (ignored)
[flashc.lanl.gov:09921] mca: base: component_find: unable to open:
libbproc.so.4: cannot open shared object file: No such file or
directory (ignored)
mpiexec: relocation error: /net/scratch1/dog/flash64/openmpi/
openmpi-1.0.2-32b/lib/openmpi/mca_soh_bproc.so: undefined symbol:
bproc_nodelist

The problem now looks like /lib/libc.so.6 is not longer available.
Indeed, it is available on the compiler nodes but it cannot be found
on the backend nodes - whoops!


Well, that's interesting.  Is this on a bproc platform?  If so, you
might be best off configuring with either --enable-static or --
disable-dlopen.  Either one will prevent components from loading,
which doesn't seem to always work well.

Also, it looks like at least one of the components has a different
libc its linked against than the others.  This makes me think that
perhaps you have some old components from a previous build in your
tree.  You might want to completely remove your installation prefix
(or lib/openmpi in your installation prefix) and run make install  
again.


Are you no longer seeing the errors about epoll?


The other problem is that the -m32 flag didn't make it into mpicc for
some reason.


This is expected behavior.  There are going to be more and more cases
where Open MPI provides one wrapper compiler that does the right
thing whether the user passes in -m32 / -m64 (or any of the vendor
options for doing the same thing).  So it will be increasingly
impossible for us to know what to add (but as Jeff said, you can tell
configure to always add -m32 to the wrapper compilers if you want).

Brian
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




[OMPI users] mpirun crashes when compiled in 64-bit mode on Apple Mac Pro

2006-10-26 Thread Daniel Vollmer

Hi all,

I've compiled open-mpi 1.1.2 in 64-bit mode (using XCode 2.4 /  
i686-apple-darwin8-gcc-4.0.1 (GCC) 4.0.1 (Apple Computer, Inc. build 5363)) with

./configure --prefix=/usr/local/openmpi-1.1.2 --enable-debug CFLAGS=-m64 CXXFLAGS=-m64 OBJCFLAGS=-m64 LDLFLAGS=-m64

on an Intel Mac Pro (with Xeon 51XX processors) on Mac OS 10.4.8.
Everything builds fine and results in proper 64bit libraries and  
executables. Unfortunately, when attempting to run something as  
simple as

/usr/local/openmpi-1.1.2/bin/mpirun ls
it crashes (and hangs) with a NULL pointer dereference after outputting
[Sonnenblume.local:25036] opal_ifinit: unable to find network  
interfaces.


gdb shows the following:
Sonnenblume:~/Development/tau/openmpi-1.1.2 maven$ gdb /usr/local/ 
openmpi-1.1.2/bin/mpirun
GNU gdb 6.3.50-20050815 (Apple version gdb-563) (Wed Jul 19 05:10:58  
GMT 2006)

Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and  
you are
welcome to change it and/or distribute copies of it under certain  
conditions.

Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for  
details.
This GDB was configured as "i386-apple-darwin"...Reading symbols for  
shared libraries  done


(gdb) run ls
Starting program: /usr/local/openmpi-1.1.2/bin/mpirun ls
Reading symbols for shared libraries .+++ done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
[Sonnenblume.local:25051] opal_ifinit: unable to find network  
interfaces.

Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done
Reading symbols for shared libraries . done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x
0x in ?? ()
(gdb) bt
#0  0x in ?? ()
#1  0x00010040851c in orte_init_stage1 (infrastructure=true) at  
runtime/orte_init_stage1.c:267
#2  0x00010040c727 in orte_system_init (infrastructure=true) at  
runtime/orte_system_init.c:41
#3  0x000100407eea in orte_init (infrastructure=true) at runtime/ 
orte_init.c:48
#4  0x00010e20 in orterun (argc=2, argv=0x7fff5fbffbc0) at  
orterun.c:329
#5  0x00010cc1 in main (argc=2, argv=0x7fff5fbffbc0) at  
main.c:13



Any ideas / advice?

Thanks,
Daniel.




omni_output.tar.bz2
Description: Binary data


Re: [OMPI users] mpirun crashes when compiled in 64-bit mode on Apple Mac Pro

2006-10-26 Thread Daniel Vollmer


On 26.10.2006, at 23:12, Ralph H Castain wrote:

If you wouldn't mind, could you try it again after applying the  
attached
patch? This looks like a problem we encountered on another release  
where
something in the runtime didn't get initialized early enough. It  
only shows

up in certain circumstances, but this seems to fix it.


Thank you for the quick reply, but the patch did not help matters.  
I'm currently in the process of compiling a current gcc, as I am not  
sure how far Apple's rather old gcc 4.0.1 derivative can be trusted.


Daniel.




[OMPI users] bproc problems

2007-04-26 Thread Daniel Gruner
Hi

I have been testing OpenMPI 1.2, and now 1.2.1, on several BProc-
based clusters, and I have found some problems/issues.  All my
clusters have standard ethernet interconnects, either 100Base/T or
Gigabit, on standard switches.

The clusters are all running Clustermatic 5 (BProc 4.x), and range
from 32-bit Athlon, to 32-bit Xeon, to 64-bit Opteron.  In all cases
the same problems occur, identically.  I attach here the results
from "ompi_info --all" and the config.log, for my latest build on
an Opteron cluster, using the Pathscale compilers.  I had exactly
the same problems when using the vanilla GNU compilers.

Now for a description of the problem:

When running an mpi code (cpi.c, from the standard mpi examples, also
attached), using the mpirun defaults (e.g. -byslot), with a single
process:

sonoma:dgruner{134}> mpirun -n 1 ./cpip
[n17:30019] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
pi is approximately 3.1415926544231341, Error is 0.08333410
wall clock time = 0.000199

However, if one tries to run more than one process, this bombs:

sonoma:dgruner{134}> mpirun -n 2 ./cpip
.
.
.
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
[n21:30029] OOB: Connection to HNP lost
.
. ad infinitum

If one uses the option "-bynode", things work:

sonoma:dgruner{145}> mpirun -bynode -n 2 ./cpip
[n17:30055] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
Process 1 on n21
pi is approximately 3.1415926544231318, Error is 0.0887
wall clock time = 0.010375


Note that there is always the message about "openpty failed, using pipes 
instead".

If I run more processes (on my 3-node cluster, with 2 cpus per node), the
openpty message appears repeatedly for the first node:

sonoma:dgruner{146}> mpirun -bynode -n 6 ./cpip
[n17:30061] odls_bproc: openpty failed, using pipes instead
[n17:30061] odls_bproc: openpty failed, using pipes instead
Process 0 on n17
Process 2 on n49
Process 1 on n21
Process 5 on n49
Process 3 on n17
Process 4 on n21
pi is approximately 3.1415926544231239, Error is 0.0807
wall clock time = 0.050332


Should I worry about the openpty failure?  I suspect that communications
may be slower this way.  Using the -byslot option always fails, so this
is a bug.  The same occurs for all the codes that I have tried, both simple
and complex.

Thanks for your attention to this.
Regards,
Daniel
-- 

Dr. Daniel Grunerdgru...@chem.utoronto.ca
Dept. of Chemistry   daniel.gru...@utoronto.ca
University of Torontophone:  (416)-978-8689
80 St. George Street fax:(416)-978-5325
Toronto, ON  M5S 3H6, Canada finger for PGP public key


cpi.c.gz
Description: GNU Zip compressed data


config.log.gz
Description: GNU Zip compressed data


ompiinfo.gz
Description: GNU Zip compressed data


Re: [OMPI users] Compile WRFV2.2 with OpenMPI

2007-04-27 Thread Daniel Gruner
From Jiming's error messages, it seems that he is using 1.1 libraries
and header files while supposedly compiling for OMPI 1.2, 
which would explain the undefined types.  Am I wrong in this assessment?

Daniel


On Fri, Apr 27, 2007 at 08:03:34AM -0400, Jeff Squyres wrote:
> This is quite odd; we have tested OMPI 1.1.x with the intel compilers  
> quite a bit.  In particular, it seems to be complaining about  
> MPI_Fint and MPI_Comm, but these two types should have been  
> typedef'ed earlier in mpi.h.
> 
> Can you send along the information listed on the "Getting Help" page  
> on the web site, and also include your mpi.h file?
> 
> Thanks!
> 
> 
> 
> On Apr 26, 2007, at 5:28 PM, Jiming Jin wrote:
> 
> > Dear Users:
> >
> >  I have been trying to use the intel ifort and icc compilers to  
> > compile an atmospheric model called the Weather Research &  
> > Forecasting model (WRFV2.2) on a Linux Cluster (x86_64) using Open- 
> > MPI v1.2 that were also compiled with INTEL ICC.   However, I got a  
> > lot of error messages as follows when compiling WRF.
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(788):  
> > error: expected an identifier
> >   OMPI_DECLSPEC  MPI_Fint MPI_Comm_c2f(MPI_Comm comm);
> >   ^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(802):  
> > error: "MPI_Comm" has already been declared in the current scope
> >   OMPI_DECLSPEC  MPI_Comm MPI_Comm_f2c(MPI_Fint comm);
> >   ^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(804):  
> > error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC  int MPI_Comm_free(MPI_Comm *comm);
> >^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(805):  
> > error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC  int MPI_Comm_get_attr(MPI_Comm comm, int comm_keyval,
> >^
> > /data/software/x86_64/open-mpi/1.1.4-intel//include/mpi.h(807):  
> > error: function "MPI_Comm" is not a type name
> >   OMPI_DECLSPEC  int MPI_Comm_get_errhandler(MPI_Comm comm,  
> > MPI_Errhandler *erhandler);
> >
> > I would highly appreciate it if someone could give me suggestions  
> > on how to fix the problem.
> >
> > Jiming
> > --
> > Jiming Jin, PhD
> > Earth Sciences Division
> > Lawrence Berkeley National Lab
> > One Cyclotron Road, Mail-Stop 90-1116
> > Berkeley, CA 94720
> > Tel: 510-486-7551
> > Fax: 510-486-5686
> >
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> -- 
> Jeff Squyres
> Cisco Systems
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

-- 

Dr. Daniel Grunerdgru...@chem.utoronto.ca
Dept. of Chemistry   daniel.gru...@utoronto.ca
University of Torontophone:  (416)-978-8689
80 St. George Street fax:(416)-978-5325
Toronto, ON  M5S 3H6, Canada finger for PGP public key


Re: [OMPI users] bproc problems

2007-04-27 Thread Daniel Gruner
Thanks to both you and David Gunter.  I disabled pty support and
it now works.  

There is still the issue of the mpirun default being "-byslot", which
causes all kinds of trouble.  Only by using "-bynode" do things work
properly.

Daniel

On Thu, Apr 26, 2007 at 02:28:33PM -0600, gshipman wrote:
> There is a known issue on BProc 4 w.r.t. pty support. Open MPI by  
> default will try to use ptys for I/O forwarding but will revert to  
> pipes if ptys are not available.
> 
> You can "safely" ignore the pty warnings, or you may want to rerun  
> configure and add:
> --disable-pty-support
> 
> I say "safely" because my understanding is that some I/O data may be  
> lost if pipes are used during abnormal termination.
> 
> Alternatively you might try getting pty support working, you need to  
> configure ptys on the backend nodes.
> You can then try the following code to test if it is working  
> correctly, if this fails (it does on our BProc 4 cluster) you  
> shouldn't use ptys on BProc.
> 
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <errno.h>
> #include <pty.h>
> 
> int
> main(int argc, char *agrv[])
> {
>int amaster, aslave;
> 
>if (openpty(&amaster, &aslave, NULL, NULL, NULL) < 0) {
>  printf("openpty() failed with errno = %d, %s\n", errno, strerror 
> (errno));
>} else {
>  printf("openpty() succeeded\n");
>}
> 
>return 0;
> }
> 
> 
> 
> 
> 
> 
> On Apr 26, 2007, at 2:06 PM, Daniel Gruner wrote:
> 
> > Hi
> >
> > I have been testing OpenMPI 1.2, and now 1.2.1, on several BProc-
> > based clusters, and I have found some problems/issues.  All my
> > clusters have standard ethernet interconnects, either 100Base/T or
> > Gigabit, on standard switches.
> >
> > The clusters are all running Clustermatic 5 (BProc 4.x), and range
> > from 32-bit Athlon, to 32-bit Xeon, to 64-bit Opteron.  In all cases
> > the same problems occur, identically.  I attach here the results
> > from "ompi_info --all" and the config.log, for my latest build on
> > an Opteron cluster, using the Pathscale compilers.  I had exactly
> > the same problems when using the vanilla GNU compilers.
> >
> > Now for a description of the problem:
> >
> > When running an mpi code (cpi.c, from the standard mpi examples, also
> > attached), using the mpirun defaults (e.g. -byslot), with a single
> > process:
> >
> > sonoma:dgruner{134}> mpirun -n 1 ./cpip
> > [n17:30019] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > pi is approximately 3.1415926544231341, Error is 0.08333410
> > wall clock time = 0.000199
> >
> > However, if one tries to run more than one process, this bombs:
> >
> > sonoma:dgruner{134}> mpirun -n 2 ./cpip
> > .
> > .
> > .
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > [n21:30029] OOB: Connection to HNP lost
> > .
> > . ad infinitum
> >
> > If one uses de option "-bynode", things work:
> >
> > sonoma:dgruner{145}> mpirun -bynode -n 2 ./cpip
> > [n17:30055] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > Process 1 on n21
> > pi is approximately 3.1415926544231318, Error is 0.0887
> > wall clock time = 0.010375
> >
> >
> > Note that there is always the message about "openpty failed, using  
> > pipes instead".
> >
> > If I run more processes (on my 3-node cluster, with 2 cpus per  
> > node), the
> > openpty message appears repeatedly for the first node:
> >
> > sonoma:dgruner{146}> mpirun -bynode -n 6 ./cpip
> > [n17:30061] odls_bproc: openpty failed, using pipes instead
> > [n17:30061] odls_bproc: openpty failed, using pipes instead
> > Process 0 on n17
> > Process 2 on n49
> > Process 1 on n21
> > Process 5 on n49
> > Process 3 on n17
> > Process 4 on n21
> > pi is approximately 3.1415926544231239, Error is 0.0807
> > wall clock time = 0.050332
> >
> >
> > Should I worry about the openpty failure?  I suspect that  
> > communications
> > may be slower this way.  Using the -byslot option always fails, so  
> &

[OMPI users] Compilation bug in libtool

2007-06-01 Thread Daniel Pfenniger
Hello,

version 1.2.2 refuses to compile on Mandriva 2007.1:
(more details are in the attached lg files)
...

make[2]: Entering directory `/usr/src/rpm/BUILD/openmpi-1.2.2/opal/asm'
depbase=`echo asm.lo | sed 's|[^/]*$|.deps/&|;s|\.lo$||'`; \
if /bin/sh ../../libtool --tag=CC --mode=compile gcc -DHAVE_CONFIG_H -I. -I. \
  -I../../opal/include -I../../orte/include -I../../ompi/include \
  -I../../ompi/include -I../.. -O3 -DNDEBUG -finline-functions \
  -fno-strict-aliasing -pthread -MT asm.lo -MD -MP -MF "$depbase.Tpo" \
  -c -o asm.lo asm.c; \
then mv -f "$depbase.Tpo" "$depbase.Plo"; else rm -f "$depbase.Tpo"; exit 1; fi
../../libtool: line 813: X--tag=CC: command not found
../../libtool: line 846: libtool: ignoring unknown tag : command not found
../../libtool: line 813: X--mode=compile: command not found
../../libtool: line 979: *** Warning: inferring the mode of operation is deprecated.: command not found
../../libtool: line 980: *** Future versions of Libtool will require --mode=MODE be specified.: command not found
../../libtool: line 1123: Xgcc: command not found
../../libtool: line 1123: X-DHAVE_CONFIG_H: command not found
../../libtool: line 1123: X-I.: command not found
../../libtool: line 1123: X-I.: command not found
../../libtool: line 1123: X-I../../opal/include: No such file or directory
../../libtool: line 1123: X-I../../orte/include: No such file or directory
../../libtool: line 1123: X-I../../ompi/include: No such file or directory
../../libtool: line 1123: X-I../../ompi/include: No such file or directory
../../libtool: line 1123: X-I../..: No such file or directory
../../libtool: line 1123: X-O3: command not found
../../libtool: line 1123: X-DNDEBUG: command not found
../../libtool: line 1123: X-finline-functions: command not found
../../libtool: line 1123: X-fno-strict-aliasing: command not found
../../libtool: line 1123: X-pthread: command not found
../../libtool: line 1123: X-MT: command not found
../../libtool: line 1123: Xasm.lo: command not found
../../libtool: line 1123: X-MD: command not found
../../libtool: line 1123: X-MP: command not found
../../libtool: line 1123: X-MF: command not found
../../libtool: line 1123: X.deps/asm.Tpo: No such file or directory
../../libtool: line 1123: X-c: command not found
../../libtool: line 1175: Xasm.lo: command not found
../../libtool: line 1180: libtool: compile: cannot determine name of library object from `': command not found
make[2]: *** [asm.lo] Error 1
make[2]: Leaving directory `/usr/src/rpm/BUILD/openmpi-1.2.2/opal/asm'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/src/rpm/BUILD/openmpi-1.2.2/opal'
make: *** [all-recursive] Error 1
[root openmpi-1.2.2]#




-- 
-

Dan





compile_bug_libtool.tgz
Description: application/tar-gz


[OMPI users] collective algorithms

2014-11-17 Thread Faraj, Daniel A
I am trying to survey the collective algorithms in Open MPI.
I looked at the source code but could not make out the guts of the communication 
algorithms.
There are some Open MPI papers that talk about which algorithms are used in 
certain collectives, but they are not detailed.
Has anybody done this sort of work, or can someone point me to a paper?

Basically, for a given collective operation:

a)  Which communication algorithm is used for given criteria (i.e. message 
size or np)?

b)  What is the theoretical cost of that algorithm?

Thanx


---
Daniel Faraj



Re: [OMPI users] collective algorithms

2014-11-20 Thread Faraj, Daniel A
Gilles,

Thanx for the valuable information.  So this solves part of the puzzle.  The 
next thing is to know the cost of these algorithms.  Some of them seem to be 
standard; however, I am afraid there could be some modifications that 
ultimately alter the cost.  Hence I asked for a paper.
I will look around. Again, thanx.



---
Daniel Faraj

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gilles Gouaillardet
Sent: Monday, November 17, 2014 10:07 PM
To: Open MPI Users
Subject: Re: [OMPI users] collective algorithms

Daniel,

you can run
$ ompi_info --parseable --all | grep _algorithm: | grep enumerator

that will give you the list of supported algo for the collectives,
here is a sample output :

mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:0:ignore
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:1:basic_linear
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:2:nonoverlapping
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:3:recursive_doubling
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:4:ring
mca:coll:tuned:param:coll_tuned_allreduce_algorithm:enumerator:value:5:segmented_ring


the decision (which algorithm is used, based on communicator size/message size/...) 
is made in
ompi/mca/coll/tuned/coll_tuned_decision_fixed.c
and can be overridden via a config file or an environment variable
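
if you prefer to discover these from code rather than from ompi_info, here is a
sketch using the MPI 3.0 tools interface (assuming your Open MPI build exposes
its MCA parameters through MPI_T, which recent releases do):

/* Sketch: list MPI_T control variables whose names mention "coll_tuned".
 * Assumes the MPI library exposes its MCA parameters through MPI_T. */
#include <stdio.h>
#include <string.h>
#include <mpi.h>

int main(void)
{
  int provided, ncvar, i;

  MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
  MPI_T_cvar_get_num(&ncvar);

  for (i = 0; i < ncvar; i++) {
    char name[256], desc[256];
    int name_len = sizeof(name), desc_len = sizeof(desc);
    int verbosity, binding, scope;
    MPI_Datatype dtype;
    MPI_T_enum enumtype;

    if (MPI_T_cvar_get_info(i, name, &name_len, &verbosity, &dtype,
                            &enumtype, desc, &desc_len, &binding,
                            &scope) == MPI_SUCCESS
        && strstr(name, "coll_tuned") != NULL)
      printf("%s\n", name);
  }

  MPI_T_finalize();
  return 0;
}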

i cannot point you to a paper, and hopefully someone else will

Cheers,

Gilles


On 2014/11/18 12:53, Faraj, Daniel A wrote:

I am trying to survey the collective algorithms in Open MPI.

I looked at the src code but could not make out the guts of the communication 
algorithms.

There are some open mpi papers but not detailed, where they talk about what 
algorithms are using in certain collectives.

Has anybody done this sort of work, or point me to a paper?



Basically, for a given collective operation, what are:



a)  Communication algorithm being used for a given criteria (i.e. message 
size or np)



b)  What is theoretical algorithm cost



Thanx





---

Daniel Faraj








___

users mailing list

us...@open-mpi.org<mailto:us...@open-mpi.org>

Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users

Link to this post: 
http://www.open-mpi.org/community/lists/users/2014/11/25831.php



[OMPI users] netloc

2014-12-05 Thread Faraj, Daniel A
I have installed up-to-date hwloc, jansson, and netloc versions on an IB cluster.
I generated the lstopo xml files for a number of nodes.

When I executed:
netloc_ib_gather_raw --out-dir ib-raw --hwloc-dir hwloc  --sudo
Found 0 subnets in hwloc directory:

I searched the forum and someone had a similar issue, but no solution was posted.  
Any idea why we are seeing 0 subnets?
Is there something I should check for in the xml files?

---
Daniel Faraj



[OMPI users] open mpi and MLX

2014-12-09 Thread Faraj, Daniel A
I am having trouble running simple benchmarks, like the OSU bidirectional 
bandwidth test, with recent OMPI (newer than version 1.8.1) over MLX.
All versions up to and including 1.8.1 seem to work.
The issue is that over FDR the run will hang frequently and complain that the 
physical memory available for the user run is very low.

The bug starts in v1.8.2.
I searched the src code for differences, but no luck.

I get the message below and the run hangs...
--
WARNING: It appears that your OpenFabrics subsystem is configured to only
allow registering part of your physical memory.  This can cause MPI jobs to
run with erratic performance, hang, and/or crash.

This may be caused by your OpenFabrics vendor limiting the amount of
physical memory that can be registered.  You should investigate the
relevant Linux kernel module parameters that control how much physical
memory can be registered, and increase them to allow registering all
physical memory on your machine.

See this Open MPI FAQ item for more information on these Linux kernel module
parameters:

http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

  Local host:  sb-cn16
  Registerable memory: 24576 MiB
  Total memory:65457 MiB

Your MPI job will continue, but may be behave poorly and/or hang.
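
For what it's worth, the Open MPI FAQ describes the mlx4 registration limit as
roughly (2^log_num_mtt) * (2^log_mtts_per_seg) * page_size (if I remember the
entry correctly); a tiny sketch with hypothetical module-parameter values
(check the real ones under /sys/module/mlx4_core/parameters/ if present):

/* Sketch: estimate max registerable memory from mlx4 module parameters.
 * The parameter values below are hypothetical; read the real ones from
 * /sys/module/mlx4_core/parameters/ on the node. */
#include <stdio.h>

int main(void)
{
  unsigned long long log_num_mtt      = 20;   /* assumed */
  unsigned long long log_mtts_per_seg = 3;    /* assumed */
  unsigned long long page_size        = 4096; /* bytes   */

  unsigned long long max_reg = (1ULL << log_num_mtt)
                             * (1ULL << log_mtts_per_seg)
                             * page_size;

  printf("max registerable memory ~ %llu MiB\n", max_reg / (1024ULL * 1024ULL));
  return 0;
}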


---
Daniel Faraj


