Re: [OMPI users] Segmentation fault with SLURM and non-local nodes

2011-02-03 Thread Samuel K. Gutierrez

Hi,

I just tried to reproduce the problem that you are experiencing and  
was unable to.


[samuel@lo1-fe ~]$ salloc -n32 mpirun --display-map ./mpi_app
salloc: Job is in held state, pending scheduler release
salloc: Pending job allocation 138319
salloc: job 138319 queued and waiting for resources
salloc: job 138319 has been allocated resources
salloc: Granted job allocation 138319

    JOB MAP   

 Data for node: Name: lob083    Num procs: 16
Process OMPI jobid: [26464,1] Process rank: 0
Process OMPI jobid: [26464,1] Process rank: 1
Process OMPI jobid: [26464,1] Process rank: 2
Process OMPI jobid: [26464,1] Process rank: 3
Process OMPI jobid: [26464,1] Process rank: 4
Process OMPI jobid: [26464,1] Process rank: 5
Process OMPI jobid: [26464,1] Process rank: 6
Process OMPI jobid: [26464,1] Process rank: 7
Process OMPI jobid: [26464,1] Process rank: 8
Process OMPI jobid: [26464,1] Process rank: 9
Process OMPI jobid: [26464,1] Process rank: 10
Process OMPI jobid: [26464,1] Process rank: 11
Process OMPI jobid: [26464,1] Process rank: 12
Process OMPI jobid: [26464,1] Process rank: 13
Process OMPI jobid: [26464,1] Process rank: 14
Process OMPI jobid: [26464,1] Process rank: 15

 Data for node: Name: lob084    Num procs: 16
Process OMPI jobid: [26464,1] Process rank: 16
Process OMPI jobid: [26464,1] Process rank: 17
Process OMPI jobid: [26464,1] Process rank: 18
Process OMPI jobid: [26464,1] Process rank: 19
Process OMPI jobid: [26464,1] Process rank: 20
Process OMPI jobid: [26464,1] Process rank: 21
Process OMPI jobid: [26464,1] Process rank: 22
Process OMPI jobid: [26464,1] Process rank: 23
Process OMPI jobid: [26464,1] Process rank: 24
Process OMPI jobid: [26464,1] Process rank: 25
Process OMPI jobid: [26464,1] Process rank: 26
Process OMPI jobid: [26464,1] Process rank: 27
Process OMPI jobid: [26464,1] Process rank: 28
Process OMPI jobid: [26464,1] Process rank: 29
Process OMPI jobid: [26464,1] Process rank: 30
Process OMPI jobid: [26464,1] Process rank: 31


SLURM 2.1.15
Open MPI 1.4.3 configured with: --with-platform=./contrib/platform/lanl/tlcc/debug-nopanasas


I'll dig a bit further.

Sam

On Feb 2, 2011, at 9:53 AM, Samuel K. Gutierrez wrote:


Hi,

We'll try to reproduce the problem.

Thanks,

--
Samuel K. Gutierrez
Los Alamos National Laboratory


On Feb 2, 2011, at 2:55 AM, Michael Curtis wrote:



On 28/01/2011, at 8:16 PM, Michael Curtis wrote:



On 27/01/2011, at 4:51 PM, Michael Curtis wrote:

Some more debugging information:
Is anyone able to help with this problem?  As far as I can tell  
it's a stock-standard recently installed SLURM installation.


I can try 1.5.1 but am hesitant to deploy it, as it would require
recompiling some rather large pieces of software.  Should I re-post
to the -devel list?


Regards,


___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Serial Rapid IO plug in ?

2011-02-03 Thread Mohamed Husain A.K
Thanks Jeff,
I shall ping you on the devel list.

Mohamed Husain
On Thu, Feb 3, 2011 at 6:24 PM, Jeff Squyres  wrote:

> On Feb 3, 2011, at 12:39 AM, Mohamed Husain A.K wrote:
>
> > Is there any plug-in support for byte transfer over SRIO?
>
> Not to my knowledge.
>
> > If not, how should I go about developing a plug-in?
> > The SRIO interface has an Ethernet-like encapsulation.
>
> I know very little about SRIO (i.e., I skimmed
> http://en.wikipedia.org/wiki/RapidIO :-) ).  I read that article to mean
> that SRIO is used within a single compute server.  If that's true, is it
> better to use SRIO directly (vs. shared memory-based communication)?  If I'm
> incorrect and you're using SRIO like a network fabric, then it might be
> useful to use SRIO for point-to-point MPI communications and see how it
> does.
>
> I have no idea what the API is for SRIO, but I'm guessing it would be
> suitable as a Byte Transfer Layer (BTL) plugin for Open MPI.  BTLs are the
> back-end implementations behind the "ob1" plugin for the Point-to-Point
> Messaging Layer (PML) in Open MPI -- i.e., the back-end behind the MPI
> semantics for MPI_SEND, MPI_RECV, etc.
>
> There are a few different options for developing new plugins in OMPI -- if
> I've hit anywhere close to the mark on the above paragraphs, ping us over on
> the devel list and we can get you started (i.e., developing new plugins is a
> better topic for the devel list than the general users' list; see
> http://www.open-mpi.org/community/lists/ompi.php).
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>


Re: [OMPI users] How closely tied is a specific release of OpenMPI to the host operating system and other system software?

2011-02-03 Thread Gus Correa

Jeffrey A Cummings wrote:
Thanks for all the good replies on this thread.  I don't know if I'll be 
able to make a dent in the corporate IT bureaucracy but I'm going to try.
 




From: Prentice Bisbal
To: Open MPI Users
Date: 02/02/2011 11:35 AM
Subject: Re: [OMPI users] How closely tied is a specific release of OpenMPI to the host operating system and other system software?
Sent by: users-boun...@open-mpi.org




Jeffrey A Cummings wrote:
 > I use OpenMPI on a variety of platforms:  stand-alone servers running
 > Solaris on sparc boxes and Linux (mostly CentOS) on AMD/Intel boxes,
 > also Linux (again CentOS) on large clusters of AMD/Intel boxes.  These
 > platforms all have some version of the 1.3 OpenMPI stream.  I recently
 > requested an upgrade on all systems to 1.4.3 (for production work) and
 > 1.5.1 (for experimentation).  I'm getting a lot of push back from the
 > SysAdmin folks claiming that OpenMPI is closely intertwined with the
 > specific version of the operating system and/or other system software
 > (i.e., Rocks on the clusters).  I need to know if they are telling me
 > the truth or if they're just making excuses to avoid the work.  To state
 > my question another way:  Apparently each release of Linux and/or Rocks
 > comes with some version of OpenMPI bundled in.  Is it dangerous in some
 > way to upgrade to a newer version of OpenMPI?  Thanks in advance for any
 > insight anyone can provide.
 >
 > - Jeff
 >

Jeff,

OpenMPI is more or less a user-space program, and isn't that tightly
coupled to the OS at all. As long as the OS has the correct network
drivers (Ethernet, IB, or other), that's all OpenMPI needs to do its
job. In fact, you can install it yourself in your own home directory (if
your home directory is shared amongst the cluster nodes you want to
use), and run it from there - no special privileges needed.

I have many different versions of OpenMPI installed on my systems,
without a problem.

As a system administrator responsible for maintaining OpenMPI on several
clusters, I'd say it sounds like one of two things:

1. Your system administrators really don't know what they're talking
about, or,

2. They're lying to you to avoid doing work.

--
Prentice


Jeff

Worst-case scenario, you can install OpenMPI yourself from the source
tarball, for instance in a subdirectory of your ${HOME}.

You would just need to adjust your PATH and LD_LIBRARY_PATH.
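
A minimal sketch of that adjustment, assuming the default layout of a
source install (executables under <prefix>/bin, libraries under
<prefix>/lib). The function name env_with_openmpi and the example prefix
are made up for illustration; in a shell you would normally just export
the two variables in your startup file.

```python
import os


def env_with_openmpi(prefix, base_env=None):
    """Return a copy of the environment with a private Open MPI install first.

    Assumes the install keeps executables in <prefix>/bin and shared
    libraries in <prefix>/lib (an assumption; check your own install).
    """
    env = dict(os.environ if base_env is None else base_env)
    for var, subdir in (("PATH", "bin"), ("LD_LIBRARY_PATH", "lib")):
        new_dir = os.path.join(prefix, subdir)
        old = env.get(var, "")
        # Prepend so the private install wins over any system-wide copy.
        env[var] = new_dir + (os.pathsep + old if old else "")
    return env


if __name__ == "__main__":
    # Hypothetical prefix, following the ${HOME}/openmpi/<version> idea.
    prefix = os.path.join(os.path.expanduser("~"), "openmpi", "1.4.3")
    env = env_with_openmpi(prefix)
    print(env["PATH"].split(os.pathsep)[0])
```

The returned dictionary can be passed as env= to subprocess.run() when
launching mpirun, which leaves the system-wide installation untouched.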

Gus Correa


Re: [OMPI users] OpenMPI version syntax?

2011-02-03 Thread Gus Correa

Jeffrey A Cummings wrote:
The context was wrt the OpenMPI version that is bundled with a specific 
version of CentOS Linux which my IT folks are about to install on one of 
our servers.  Since the most recent 1.4 stream version is 1.4.3, I'm 
afraid that 1.4-4 is really some variant of 1.4 (i.e., 1.4.0) and hence 
not that new.





From: Jeff Squyres
To: Open MPI Users
Date: 02/02/2011 07:38 PM
Subject: Re: [OMPI users] OpenMPI version syntax?
Sent by: users-boun...@open-mpi.org




On Feb 2, 2011, at 1:44 PM, Jeffrey A Cummings wrote:

 > I've encountered a supposed OpenMPI version of 1.4-4.  Is the hyphen
 > a typo or is this syntax correct and if so what does it mean?


Is this an RPM version number?  It's fairly common for RPMs to add "-X" 
at the end of the version number.  The "X" indicates the RPM version 
number (i.e., the version number of the packaging -- not the package 
itself).
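
To make the split concrete, here is a small sketch in Python. It is not
part of rpm or Open MPI, and the helper name split_rpm_label is
hypothetical. RPM itself calls the two fields "version" and "release",
and neither field may contain a hyphen, so splitting at the last hyphen
is safe.

```python
def split_rpm_label(label):
    """Split 'VERSION-RELEASE' into (version, release).

    Returns (label, None) when there is no hyphen, i.e. when the string
    is a plain upstream version with no packaging release attached.
    """
    version, sep, release = label.rpartition("-")
    if not sep:
        return label, None
    return version, release


if __name__ == "__main__":
    print(split_rpm_label("1.4-4"))   # ('1.4', '4'): upstream 1.4, 4th packaging
    print(split_rpm_label("1.4.3"))   # ('1.4.3', None): no release suffix
```

Read this way, "1.4-4" is the fourth packaging of upstream version 1.4,
not a fourth patch release in the 1.4 series.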


Open MPI's version number scheme is explained here:

   http://www.open-mpi.org/software/ompi/versions/

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


Jeff (Cummings)

 ompi_info | grep 'Open MPI' should tell.

(*If* you know which installation your ompi_info points to!
Otherwise, use the full path.)

Here is yet another reason for installing OpenMPI
from the source tarball
(you'll be 100% sure it is 1.4.3), and putting it in a non-system
directory of choice, such as ${HOME}/openmpi/1.4.3.

My two cents (which were not asked for).
Gus Correa


Re: [OMPI users] OpenMPI version syntax?

2011-02-03 Thread Prentice Bisbal
rpm -qi  might give you more detailed information.

If not, as a last resort, you can download and install the SRPM and
then look at the name of the tarball in /usr/src/redhat/SOURCES.

Prentice

Jeffrey A Cummings wrote:
> The context was wrt the OpenMPI version that is bundled with a specific
> version of CentOS Linux which my IT folks are about to install on one of
> our servers.  Since the most recent 1.4 stream version is 1.4.3, I'm
> afraid that 1.4-4 is really some variant of 1.4 (i.e., 1.4.0) and hence
> not that new.
> 
> 
> 
> 
> From: Jeff Squyres
> To: Open MPI Users
> Date: 02/02/2011 07:38 PM
> Subject: Re: [OMPI users] OpenMPI version syntax?
> Sent by: users-boun...@open-mpi.org
> 
> 
> 
> 
> On Feb 2, 2011, at 1:44 PM, Jeffrey A Cummings wrote:
> 
> > I've encountered a supposed OpenMPI version of 1.4-4.  Is the hyphen a
> > typo or is this syntax correct and if so what does it mean?
> 
> Is this an RPM version number?  It's fairly common for RPMs to add "-X"
> at the end of the version number.  The "X" indicates the RPM version
> number (i.e., the version number of the packaging -- not the package
> itself).
> 
> Open MPI's version number scheme is explained here:
> 
> http://www.open-mpi.org/software/ompi/versions/
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 


Re: [OMPI users] Windows release 1.5.1

2011-02-03 Thread Shiqing Fan

Hi Andy,

I see the problem: your version is not up to date, as your jpeg shows
"16/12/2010". You can go to the Open MPI website and download the
latest one, which came out on 1 Feb.



Regards,
Shiqing

On 2/3/2011 2:28 PM, Page, Andy (UK) wrote:

Shiqing
Thanks for the reply.
Maybe I was not that clear in what I meant.
I was looking for the library mpi_f77.lib and the header mpif.h, neither
of which I could see in the install directory (see attached jpeg).



*From:* Shiqing Fan [mailto:f...@hlrs.de]
*Sent:* 03 February 2011 13:20
*To:* Open MPI Users
*Cc:* Page, Andy (UK)
*Subject:* Re: [OMPI users] Windows release 1.5.1

Hi,

I'm not sure if I got it correct. The libraries provided in the 
installers are MPI libraries with Fortran bindings, but they are not 
the Fortran libraries.


If you want to compile Open MPI, you need to install a Fortran 
compiler, for example, Intel Fortran Compiler. And in the CMake GUI, 
you'll be able to enable building MPI Fortran bindings.



Regards,
Shiqing


On 2/3/2011 11:53 AM, Page, Andy (UK) wrote:

Dear Users,
I am looking to compile on Windows against Open MPI, in particular the
Fortran libraries.
I looked at the *MS Release notes for 1.5.1* and they gave the
*impression Fortran libraries were included but* after installing *I
don't see them!*

Can you advise what I am doing wrong?
Can I download just the Fortran side of Open MPI and compile it
(hopefully using a Visual Studio project file)?

Or should I download the full release and try compiling it?
Cheers.
BAE Systems (Operations) Limited
Registered Office: Warwick House, PO Box 87, Farnborough Aerospace 
Centre, Farnborough, Hants, GU14 6YU, UK

Registered in England & Wales No: 1996687


This email and any attachments are confidential to the intended
recipient and may also be privileged. If you are not the intended
recipient please delete it from your system and notify the sender.
You should not copy it or use it for any purpose nor disclose or
distribute its contents to any other person.









Re: [OMPI users] Windows release 1.5.1

2011-02-03 Thread Shiqing Fan

Hi,

I'm not sure if I got it correct. The libraries provided in the 
installers are MPI libraries with Fortran bindings, but they are not the 
Fortran libraries.


If you want to compile Open MPI, you need to install a Fortran compiler, 
for example, Intel Fortran Compiler. And in the CMake GUI, you'll be 
able to enable building MPI Fortran bindings.



Regards,
Shiqing


On 2/3/2011 11:53 AM, Page, Andy (UK) wrote:

Dear Users,
I am looking to compile on Windows against Open MPI, in particular the
Fortran libraries.
I looked at the *MS Release notes for 1.5.1* and they gave the
*impression Fortran libraries were included but* after installing *I
don't see them!*

Can you advise what I am doing wrong?
Can I download just the Fortran side of Open MPI and compile it
(hopefully using a Visual Studio project file)?

Or should I download the full release and try compiling it?
Cheers.




Re: [OMPI users] Serial Rapid IO plug in ?

2011-02-03 Thread Jeff Squyres
On Feb 3, 2011, at 12:39 AM, Mohamed Husain A.K wrote:

> Is there any plug-in support for byte transfer over SRIO?

Not to my knowledge.

> If not, how should I go about developing a plug-in?
> The SRIO interface has an Ethernet-like encapsulation.

I know very little about SRIO (i.e., I skimmed 
http://en.wikipedia.org/wiki/RapidIO :-) ).  I read that article to mean that 
SRIO is used within a single compute server.  If that's true, is it better to 
use SRIO directly (vs. shared memory-based communication)?  If I'm incorrect 
and you're using SRIO like a network fabric, then it might be useful to use 
SRIO for point-to-point MPI communications and see how it does.

I have no idea what the API is for SRIO, but I'm guessing it would be suitable 
as a Byte Transfer Layer (BTL) plugin for Open MPI.  BTLs are the back-end 
implementations behind the "ob1" plugin for the Point-to-Point Messaging Layer 
(PML) in Open MPI -- i.e., the back-end behind the MPI semantics for MPI_SEND, 
MPI_RECV, etc.

There are a few different options for developing new plugins in OMPI -- if I've 
hit anywhere close to the mark on the above paragraphs, ping us over on the 
devel list and we can get you started (i.e., developing new plugins is a better 
topic for the devel list than the general users' list; see 
http://www.open-mpi.org/community/lists/ompi.php).

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] Windows release 1.5.1

2011-02-03 Thread Page, Andy (UK)
Dear Users,

I am looking to compile on Windows against Open MPI, in particular the
Fortran libraries.
I looked at the MS release notes for 1.5.1 and they gave the impression
that Fortran libraries were included, but after installing I don't see them!
Can you advise what I am doing wrong?
Can I download just the Fortran side of Open MPI and compile it
(hopefully using a Visual Studio project file)?
Or should I download the full release and try compiling it?

Cheers.

BAE Systems (Operations) Limited
Registered Office: Warwick House, PO Box 87, Farnborough Aerospace
Centre, Farnborough, Hants, GU14 6YU, UK
Registered in England & Wales No: 1996687 







Re: [OMPI users] Calculate time spent on non blocking communication?

2011-02-03 Thread Eugene Loh




Okay, so forget about Peruse.

You can basically figure that your user process will either be inside
an MPI call or else not.  If it's inside an MPI call, then that's time
spent in communications (and notably in the synchronization that's
implicit to communication).  If it's not inside an MPI call, then
that's time spent in computation.  Basically, no time in this model is
attributed to both communication and computation at once.
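
A rough sketch of that two-bucket model, in plain Python so it runs
without an MPI installation. The class TimeSplit and its in_mpi() method
are hypothetical names, not part of Open MPI or any MPI binding; the idea
is only that every MPI call gets wrapped and everything else counts as
computation.

```python
import time
from contextlib import contextmanager


class TimeSplit:
    """Attribute wall-clock time to either communication or computation."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self._start = clock()
        self.comm = 0.0  # seconds spent inside wrapped (MPI-like) calls

    @contextmanager
    def in_mpi(self):
        # Wrap each MPI call (a send, a receive, a wait, ...) in this block.
        t0 = self._clock()
        try:
            yield
        finally:
            self.comm += self._clock() - t0

    def report(self):
        total = self._clock() - self._start
        # Everything outside the wrapped calls is counted as computation.
        return {
            "total": total,
            "communication": self.comm,
            "computation": total - self.comm,
        }


if __name__ == "__main__":
    ts = TimeSplit()
    with ts.in_mpi():
        time.sleep(0.05)   # stand-in for a blocking MPI call
    time.sleep(0.10)       # stand-in for local computation
    print(ts.report())
```

Note that under this model the time between posting a non-blocking
operation and waiting on it is counted as computation; only the time
spent inside the calls themselves lands in the communication bucket.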

There is an OMPI FAQ on performance tools:
http://www.open-mpi.org/faq/?category=perftools
Perhaps something there will be helpful for you.  Specifically, the "Sun Studio
Performance Analyzer" allows you to divide that "communication" time
between "data transfer time" and "synchronization time".  But a basic
classification as either communication or else computation is pretty
central to all the tools.

Bibrak Qamar wrote:

> As for the reason for measuring non-blocking communication: I want to
> find out what percentage of the program's time is spent on
> communication alone, on computation alone, and on the overlap of both.
>
> On Thu, Feb 3, 2011 at 5:08 AM, Eugene Loh wrote:
>
>> Again, you can try the Peruse instrumentation.  Configure OMPI with
>> --enable-peruse.  The instrumentation points might help you decide how
>> you want to define the time you want to measure.  Again, you really
>> have to spend a bunch of your own time deciding what is meaningful to
>> measure.
>>
>> Gustavo Correa wrote:
>>
>>> However, OpenMPI may give this info, with non-MPI (hence
>>> non-portable) functions, I'd guess.
>>>
>>>> From: Eugene Loh
>>>>
>>>> Anyhow, the Peruse instrumentation in OMPI might help.


Re: [OMPI users] Calculate time spent on non blocking communication?

2011-02-03 Thread Bibrak Qamar
Thanks all,

As for the reason for measuring non-blocking communication: I want to
find out what percentage of the program's time is spent on communication
alone, on computation alone, and on the overlap of both.

Bibrak Qamar
Undergraduate Student BIT-9
Member Center for High Performance Scientific Computing
NUST-School of Electrical Engineering and Computer Science.


On Thu, Feb 3, 2011 at 5:08 AM, Eugene Loh  wrote:

> Again, you can try the Peruse instrumentation.  Configure OMPI with
> --enable-peruse.  The instrumentation points might help you decide how you
> want to define the time you want to measure.  Again, you really have to
> spend a bunch of your own time deciding what is meaningful to measure.
>
> Gustavo Correa wrote:
>
>  However, OpenMPI may give this info, with non-MPI (hence non-portable)
>> functions, I'd guess.
>>
>>  From: Eugene Loh 
>>>
>>> Anyhow, the Peruse instrumentation in OMPI might help.
>>>
>>>


[OMPI users] Serial Rapid IO plug in ?

2011-02-03 Thread Mohamed Husain A.K
Is there any plug-in support for byte transfer over SRIO?
If not, how should I go about developing a plug-in?
The SRIO interface has an Ethernet-like encapsulation.