Re: [OMPI users] running openmpi in debug/verbose mode

2012-10-26 Thread Mahmood Naderan

>You can usually resolve that by configuring with --disable-dlopen
OK, I will try.
So what is the purpose of enabling dlopen? Why is dlopen not disabled by
default?
I mean, why is a configuration that generates so much traffic enabled by default?


 
Regards,
Mahmood




 From: Ralph Castain 
To: Mahmood Naderan  
Sent: Thursday, October 25, 2012 8:55 PM
Subject: Re: [OMPI users] running openmpi in debug/verbose mode
 

Sorry - we're all a tad busy with deadlines for the Supercomputing conference 
:-(

You are probably running into trouble due to dlopen pulling files across the 
network. You can usually resolve that by configuring with --disable-dlopen.
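
If you go that route, the rebuild is just the usual sequence with that extra flag - something like the following (the install prefix is only a placeholder):

  $ ./configure --prefix=/opt/openmpi-1.6.2 --disable-dlopen
  $ make -j 4 all
  $ make install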



On Oct 25, 2012, at 11:51 AM, Mahmood Naderan  wrote:

> I sent a problem to the list but didn't receive any reply. In short, we found
> that when we run an openmpi+openfoam program on a node (in a diskless cluster),
> there are huge IO operations caused by openmpi. When we run openmpi+openfoam
> on the server, there is no problem. When we run openfoam directly on the node,
> there is also no problem.
>
> Now I am looking for some verbose/debug output from openmpi which
> shows its activity (in particular IO messages, for example opening file1
> or closing file2...).
>
> Can I extract such messages?
>
> Regards,
> Mahmood
>
>
> From: Ralph Castain 
>To: Mahmood Naderan ; Open MPI Users 
> 
>Sent: Thursday, October 25, 2012 8:44 PM
>Subject: Re: [OMPI users] running openmpi in debug/verbose mode
> 
>
>There is a *ton* of debug output available - would help to know what you are 
>attempting to debug
>
>
>
>
>On Oct 25, 2012, at 11:38 AM, Mahmood Naderan  wrote:
>
>
>>
>>Dear all,
>>Is there any way to run openmpi in debug or verbose mode? Is there any log 
>>for openmpi run?
>> 
>>Regards,
>>Mahmood

[OMPI users] OpenMPI on Windows when MPI_F77 is used from a C application

2012-10-26 Thread Mathieu Gontier
Dear all,

I would like to use OpenMPI on Windows for a CFD solver instead of MPICH2. My
solver is developed in Fortran77 and driven by a C++ interface; both
levels call MPI functions.

So, I installed OpenMPI-1.6.2-x64 on my system and compiled my code
successfully. But at runtime it crashed.
I reproduced the problem in a small C application calling a Fortran
function that uses MPI_Allreduce; when I removed some aggressive optimization
options from the Fortran build, it worked:
   - Optimization: Disable (/Od)
   - Inline Function Expansion: Any Suitable (/Ob2)
   - Favor Size or Speed: Favor Fast Code (/Ot)
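
For reference, here is a sketch of the kind of reproducer I used (file and
symbol names are made up, and the C prototype assumes the common
lower-case-plus-underscore Fortran name mangling, which may differ with the
Windows compilers):

  /* main.c - C side: initializes MPI and calls the Fortran routine */
  #include <mpi.h>
  #include <stdio.h>

  /* Fortran symbol; the mangled name (case, trailing underscore) is
     compiler-dependent, so this declaration is an assumption */
  void fsum_(double *x, int *n);

  int main(int argc, char **argv)
  {
      double x[4] = {1.0, 2.0, 3.0, 4.0};
      int n = 4, rank;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      fsum_(x, &n);                 /* the Fortran side does the MPI_Allreduce */
      if (rank == 0)
          printf("x(1) after allreduce: %f\n", x[0]);
      MPI_Finalize();
      return 0;
  }

C fsum.f - Fortran side: sums X across all ranks with MPI_Allreduce
      SUBROUTINE FSUM(X, N)
      INCLUDE 'mpif.h'
      INTEGER N, IERR, I
      DOUBLE PRECISION X(N), Y(100)
      CALL MPI_ALLREDUCE(X, Y, N, MPI_DOUBLE_PRECISION,
     &                   MPI_SUM, MPI_COMM_WORLD, IERR)
      DO I = 1, N
         X(I) = Y(I)
      END DO
      END

Both objects are then linked against libmpi.lib and libmpi_f77.lib (at least
that is my understanding of the import libraries to use).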

So, I removed the same options from the Fortran parts of my solver, but it
still crashes. I tried some other options, but it keeps crashing. Does
anybody have an idea? Should I (de)activate some compilation options? Are
there specific properties required to build and link against libmpi_f77.lib?

Thanks for your help.
Mathieu.

-- 
Mathieu Gontier
- MSN: mathieu.gont...@gmail.com
- Skype: mathieu_gontier


Re: [OMPI users] running openmpi in debug/verbose mode

2012-10-26 Thread Jeff Squyres
Open MPI doesn't really do much file IO at all.  We do a little during startup 
/ shutdown, but during the majority of the MPI application run, there's 
little/no file IO from the MPI layer.

Note that the above statements assume that you are not using the MPI IO 
function calls.  If your application is using MPI IO, then of course, Open MPI 
will do lots of file IO.

My point: if your application isn't doing MPI IO, then your file IO is likely 
coming from somewhere other than Open MPI.
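
If you want to see exactly which files are being opened and closed, one thing you can do (outside of Open MPI) is trace the system calls on the node - for example, something along these lines with strace (the application name here is just a placeholder):

  $ strace -ff -e trace=open,close,read,write -o /tmp/io-trace \
        mpirun -np 2 ./your_openfoam_app

That writes one /tmp/io-trace.<pid> file per local process, so you can see which process is doing the IO.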


On Oct 25, 2012, at 2:51 PM, Mahmood Naderan wrote:

> I sent a problem to the list but didn't receive any reply. In short, we found 
> that when we run an openmpi+openfoam program on a node (in a diskless cluster), 
> there are huge IO operations caused by openmpi. When we run openmpi+openfoam
> on the server, there is no problem. When we run openfoam directly on the node,
> there is also no problem.
> 
> Now I am looking for some verbose/debug output from openmpi which 
> shows its activity (in particular IO messages, for example opening file1
> or closing file2...).
> 
> Can I extract such messages?
>  
> Regards,
> Mahmood
> 
> From: Ralph Castain 
> To: Mahmood Naderan ; Open MPI Users 
>  
> Sent: Thursday, October 25, 2012 8:44 PM
> Subject: Re: [OMPI users] running openmpi in debug/verbose mode
> 
> There is a *ton* of debug output available - would help to know what you are 
> attempting to debug
> 
> 
> On Oct 25, 2012, at 11:38 AM, Mahmood Naderan  wrote:
> 
>> 
>> Dear all,
>> Is there any way to run openmpi in debug or verbose mode? Is there any log 
>> for openmpi run?
>>  
>> Regards,
>> Mahmood


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] ompi-clean on single executable

2012-10-26 Thread Ralph Castain

On Oct 26, 2012, at 4:14 AM, Nicolas Deladerriere 
 wrote:

> Thanks all for your comments
> 
> Ralph
> 
> What I was initially looking at is a tool (or an option of orte-clean) that 
> cleans up the mess you are talking about, but only the mess that has been 
> created by a single mpirun command. As far as I have understood, orte-clean 
> cleans up all the mess on a node associated with all the open-mpi processes that 
> have run (or are currently running).

That is correct. We could fairly easily modify it to clean up leftover files 
from a single mpirun without affecting others. Unfortunately, there really 
isn't an easy way to tell which processes belong to a specific mpirun, so 
selectively killing zombies would be very hard to do.

> 
> According to Rolf's comment, the mpirun command does not usually leave any zombie 
> processes, hence it seems that the effect of orte-clean is limited. But, 
> since it exists, I was wondering whether it does anything useful?

It was created during the early years of our work when zombies were frequently 
occurring. The need for it has declined over the years, but we keep it around 
because we do still hit problems on occasion - especially during development.

> 
> Cheers,
> Nicolas
> 
> 2012/10/25 Ralph Castain 
> Okay, now I'm confused. If all you want to do is cleanly "kill" a running 
> OMPI job, then why not just issue
> 
> $ kill -TERM <pid of mpirun>
> 
> This will cause mpirun to order the clean termination of all remote procs 
> within that execution, and then cleanly terminate itself. No tool we create 
> could do it any better.
> 
> Is there an issue with doing so?
> 
> orte-clean was intended to clean up the mess if/when the above method doesn't 
> work - i.e., when you have to "kill -KILL" mpirun, which forcibly kills 
> mpirun but might leave zombie orteds on the remote nodes.
> 
> 
> On Oct 24, 2012, at 10:39 AM, Jeff Squyres  wrote:
> 
> > Or perhaps cloned, renamed to orte-kill, and modified to kill a single (or 
> > multiple) specific job(s).  That would be POSIX-like ("kill" vs. "clean").
> >
> >
> > On Oct 24, 2012, at 1:32 PM, Rolf vandeVaart wrote:
> >
> >> And just to give a little context, ompi-clean was created initially to 
> >> "clean" up a node, not for cleaning up a specific job.  It was for the 
> >> case where MPI jobs would leave some files behind or leave some processes 
> >> running.  (I do not believe this happens much at all anymore.)  But, as 
> >> was said, no reason it could not be modified.
> >>
> >>> -Original Message-
> >>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
> >>> On Behalf Of Jeff Squyres
> >>> Sent: Wednesday, October 24, 2012 12:56 PM
> >>> To: Open MPI Users
> >>> Subject: Re: [OMPI users] ompi-clean on single executable
> >>>
> >>> ...but patches would be greatly appreciated.  :-)
> >>>
> >>> On Oct 24, 2012, at 12:24 PM, Ralph Castain wrote:
> >>>
>  All things are possible, including what you describe. Not sure when we
> >>> would get to it, though.
> 
> 
>  On Oct 24, 2012, at 4:01 AM, Nicolas Deladerriere
> >>>  wrote:
> 
> > Reuti,
> >
> > The problem I am facing is a small part of our production
> > system, and I cannot modify our mpirun submission system. This is why
> > I am looking for a solution using only the ompi-clean or mpirun command
> > specification.
> >
> > Thanks,
> > Nicolas
> >
> > 2012/10/24, Reuti :
> >> Am 24.10.2012 um 11:33 schrieb Nicolas Deladerriere:
> >>
> >>> Reuti,
> >>>
> >>> Thanks for your comments,
> >>>
> >>> In our case, we are currently running different mpirun commands on
> >>> clusters sharing the same frontend. Basically we use a wrapper to
> >>> run the mpirun command and to run an ompi-clean command to clean
> >>> up
> >>> the mpi job if required.
> >>> Using ompi-clean like this just kills all other mpi jobs running on
> >>> same frontend. I cannot use queuing system
> >>
> >> Why? Using it on a single machine was only one possible setup. Its
> >> purpose is to distribute jobs to slave hosts. If you have already
> >> one frontend as login-machine it fits perfect: the qmaster (in case
> >> of SGE) can run there and the execd on the nodes.
> >>
> >> -- Reuti
> >>
> >>
> >>> as you have suggested; this
> >>> is why I was wondering about an option or other solution associated with the
> >>> ompi-clean command to avoid this general cleaning of all mpi jobs.
> >>>
> >>> Cheers
> >>> Nicolas
> >>>
> >>> 2012/10/24, Reuti :
>  Hi,
> 
>  Am 24.10.2012 um 09:36 schrieb Nicolas Deladerriere:
> 
> > I am having an issue running ompi-clean, which cleans up (this is
> > normal) the session associated with a user, which means it kills all
> > running jobs associated with this session (this is also normal).
> > But I would like to be able to clean up the session associated with a
> >

Re: [OMPI users] System CPU of openmpi-1.7rc1

2012-10-26 Thread tmishima


Hi Ralph, thank you for your comment.

I understand what you mean. As you pointed out, I have one process sleep
before finalize, so the MUMPS finalize might affect the behavior.

I will remove the MUMPS finalize (and/or initialize) call from my testing
program and try again next Monday to make my point clear.

Regards, tmishima

> I'm not sure - just fishing for possible answers. When we see high cpu
> usage, it usually occurs during MPI communications - when a process is
> waiting for a message to arrive, it polls at a high rate
> to keep the latency as low as possible. Since you have one process
> "sleep" before calling the finalize sequence, it could be that the other
> process is getting held up on a receive and thus eating the
> cpu.
>
> There really isn't anything special going on during Init/Finalize, and
> OMPI itself doesn't have any MPI communications in there. I'm not familiar
> with MUMPS, but if MUMPS finalize is doing something
> like an MPI_Barrier to ensure the procs finalize together, then that
> would explain what you see. The docs I could find imply there is some MPI
> embedded in MUMPS, but I couldn't find anything specific
> about finalize.
>
>
> On Oct 25, 2012, at 6:43 PM, tmish...@jcity.maeda.co.jp wrote:
>
> >
> >
> > Hi Ralph,
> >
> > Do you really mean "MUMPS finalize"? I don't think it has much relation
> > to this behavior.
> >
> > Anyway, I'm just a MUMPS user. I have to ask the MUMPS developers what
> > MUMPS initialize and finalize do.
> >
> > Regards,
> > tmishima
> >
> >> Out of curiosity, what does MUMPS finalize do? Does it send a message or
> >> do a barrier operation?
> >>
> >>
> >> On Oct 25, 2012, at 5:53 PM, tmish...@jcity.maeda.co.jp wrote:
> >>
> >>>
> >>>
> >>> Hi,
> >>>
> >>> I find that system CPU time of openmpi-1.7rc1 is quite different from
> >>> that of openmpi-1.6.2 as shown in the attached ganglia display.
> >>>
> >>> About 2 years ago, I reported a similar behavior of openmpi-1.4.3.
> >>> The testing method is what I used at that time.
> >>> (please see my post entitled "SYSTEM CPU with OpenMPI 1.4.3")
> >>>
> >>> Is this due to a pre-release version's check routine, or is
> >>> something going wrong?
> >>>
> >>> Best regards,
> >>> Tetsuya Mishima
> >>>
> >>> --
> >>> Testing program:
> >>> INCLUDE 'mpif.h'
> >>> INCLUDE 'dmumps_struc.h'
> >>> TYPE (DMUMPS_STRUC) MUMPS_PAR
> >>> c
> >>> MUMPS_PAR%COMM = MPI_COMM_WORLD
> >>> MUMPS_PAR%SYM = 1
> >>> MUMPS_PAR%PAR = 1
> >>> MUMPS_PAR%JOB = -1 ! INITIALIZE MUMPS
> >>> CALL MPI_INIT(IERR)
> >>> CALL DMUMPS(MUMPS_PAR)
> >>> c
> >>> CALL MPI_COMM_RANK( MPI_COMM_WORLD, MYID, IERR )
> >>> IF ( MYID .EQ. 0 ) CALL SLEEP(180) ! WAIT 180 SEC.
> >>> c
> >>> MUMPS_PAR%JOB = -2 ! FINALIZE MUMPS
> >>> CALL DMUMPS(MUMPS_PAR)
> >>> CALL MPI_FINALIZE(IERR)
> >>> c
> >>> END
> >>> (This does nothing but call the initialize & finalize
> >>> routines of MUMPS & MPI)
> >>>
> >>> command line : mpirun -host node03 -np 16 ./testrun
> >>>
> >>> (See attached file: openmpi17rc1-cmp.bmp)



Re: [OMPI users] ompi-clean on single executable

2012-10-26 Thread Nicolas Deladerriere
Thanks all for your comments

Ralph

What I was initially looking at is a tool (or an option of orte-clean) that
cleans up the mess you are talking about, but only the mess that has been
created by a single mpirun command. As far as I have understood, orte-clean
cleans up all the mess on a node associated with all the open-mpi processes
that have run (or are currently running).

According to Rolf's comment, the mpirun command does not usually leave any
zombie processes, hence it seems that the effect of orte-clean is limited.
But, since it exists, I was wondering whether it does anything useful?

Cheers,
Nicolas

2012/10/25 Ralph Castain 

> Okay, now I'm confused. If all you want to do is cleanly "kill" a running
> OMPI job, then why not just issue
>
> $ kill -TERM <pid of mpirun>
>
> This will cause mpirun to order the clean termination of all remote procs
> within that execution, and then cleanly terminate itself. No tool we create
> could do it any better.
>
> Is there an issue with doing so?
>
> orte-clean was intended to clean up the mess if/when the above method
> doesn't work - i.e., when you have to "kill -KILL" mpirun, which forcibly
> kills mpirun but might leave zombie orteds on the remote nodes.
>
>
> On Oct 24, 2012, at 10:39 AM, Jeff Squyres  wrote:
>
> > Or perhaps cloned, renamed to orte-kill, and modified to kill a single
> (or multiple) specific job(s).  That would be POSIX-like ("kill" vs.
> "clean").
> >
> >
> > On Oct 24, 2012, at 1:32 PM, Rolf vandeVaart wrote:
> >
> >> And just to give a little context, ompi-clean was created initially to
> "clean" up a node, not for cleaning up a specific job.  It was for the case
> where MPI jobs would leave some files behind or leave some processes
> running.  (I do not believe this happens much at all anymore.)  But, as was
> said, no reason it could not be modified.
> >>
> >>> -Original Message-
> >>> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
> >>> On Behalf Of Jeff Squyres
> >>> Sent: Wednesday, October 24, 2012 12:56 PM
> >>> To: Open MPI Users
> >>> Subject: Re: [OMPI users] ompi-clean on single executable
> >>>
> >>> ...but patches would be greatly appreciated.  :-)
> >>>
> >>> On Oct 24, 2012, at 12:24 PM, Ralph Castain wrote:
> >>>
>  All things are possible, including what you describe. Not sure when we
> >>> would get to it, though.
> 
> 
>  On Oct 24, 2012, at 4:01 AM, Nicolas Deladerriere
> >>>  wrote:
> 
> > Reuti,
> >
> > The problem I am facing is a small part of our production
> > system, and I cannot modify our mpirun submission system. This is why
> > I am looking for a solution using only the ompi-clean or mpirun command
> > specification.
> >
> > Thanks,
> > Nicolas
> >
> > 2012/10/24, Reuti :
> >> Am 24.10.2012 um 11:33 schrieb Nicolas Deladerriere:
> >>
> >>> Reuti,
> >>>
> >>> Thanks for your comments,
> >>>
> >>> In our case, we are currently running different mpirun commands on
> >>> clusters sharing the same frontend. Basically we use a wrapper to
> >>> run the mpirun command and to run an ompi-clean command to clean
> >>> up
> >>> the mpi job if required.
> >>> Using ompi-clean like this just kills all other mpi jobs running on
> >>> same frontend. I cannot use queuing system
> >>
> >> Why? Using it on a single machine was only one possible setup. Its
> >> purpose is to distribute jobs to slave hosts. If you have already
> >> one frontend as login-machine it fits perfect: the qmaster (in case
> >> of SGE) can run there and the execd on the nodes.
> >>
> >> -- Reuti
> >>
> >>
> >>> as you have suggested; this
> >>> is why I was wondering about an option or other solution associated with the
> >>> ompi-clean command to avoid this general cleaning of all mpi jobs.
> >>>
> >>> Cheers
> >>> Nicolas
> >>>
> >>> 2012/10/24, Reuti :
>  Hi,
> 
>  Am 24.10.2012 um 09:36 schrieb Nicolas Deladerriere:
> 
> > I am having an issue running ompi-clean, which cleans up (this is
> > normal) the session associated with a user, which means it kills all
> > running jobs associated with this session (this is also normal).
> > But I would like to be able to clean up the session associated with a
> > job (not a user).
> >
> > Here is my point:
> >
> > I am running two executables:
> >
> > % mpirun -np 2 myexec1
> >   --> run with PID 2399 ...
> > % mpirun -np 2 myexec2
> >   --> run with PID 2402 ...
> >
> > When I run orte-clean, I get this result:
> > % orte-clean -v
> > orte-clean: cleaning session dir tree
> > openmpi-sessions-ndelader@myhost_0
> > orte-clean: killing any lingering procs
> > orte-clean: found potential rogue orterun process
> > (pid=2399,user=ndelader), sending SIGKILL...
> > or

Re: [OMPI users] System CPU of openmpi-1.7rc1

2012-10-26 Thread Ralph Castain
I'm not sure - just fishing for possible answers. When we see high cpu usage, 
it usually occurs during MPI communications - when a process is waiting for a 
message to arrive, it polls at a high rate to keep the latency as low as 
possible. Since you have one process "sleep" before calling the finalize 
sequence, it could be that the other process is getting held up on a receive 
and thus eating the cpu.
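
As an aside, if the aggressive polling itself is a problem for you, you can tell the procs to yield the processor while they wait - for example (the parameter name below is from the 1.x series, so please double-check it against 1.7rc1):

  $ mpirun --mca mpi_yield_when_idle 1 -host node03 -np 16 ./testrun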

There really isn't anything special going on during Init/Finalize, and OMPI 
itself doesn't have any MPI communications in there. I'm not familiar with 
MUMPS, but if MUMPS finalize is doing something like an MPI_Barrier to ensure 
the procs finalize together, then that would explain what you see. The docs I 
could find imply there is some MPI embedded in MUMPS, but I couldn't find 
anything specific about finalize.


On Oct 25, 2012, at 6:43 PM, tmish...@jcity.maeda.co.jp wrote:

> 
> 
> Hi Ralph,
> 
> Do you really mean "MUMPS finalize"? I don't think it has much relation
> to this behavior.
> 
> Anyway, I'm just a MUMPS user. I have to ask the MUMPS developers what
> MUMPS initialize and finalize do.
> 
> Regards,
> tmishima
> 
>> Out of curiosity, what does MUMPS finalize do? Does it send a message or
> do a barrier operation?
>> 
>> 
>> On Oct 25, 2012, at 5:53 PM, tmish...@jcity.maeda.co.jp wrote:
>> 
>>> 
>>> 
>>> Hi,
>>> 
>>> I find that system CPU time of openmpi-1.7rc1 is quite different from
>>> that of openmpi-1.6.2 as shown in the attached ganglia display.
>>> 
>>> About 2 years ago, I reported a similar behavior of openmpi-1.4.3.
>>> The testing method is what I used at that time.
>>> (please see my post entitled "SYSTEM CPU with OpenMPI 1.4.3")
>>> 
>>> Is this due to a pre-release version's check routine, or is
>>> something going wrong?
>>> 
>>> Best regards,
>>> Tetsuya Mishima
>>> 
>>> --
>>> Testing program:
>>> INCLUDE 'mpif.h'
>>> INCLUDE 'dmumps_struc.h'
>>> TYPE (DMUMPS_STRUC) MUMPS_PAR
>>> c
>>> MUMPS_PAR%COMM = MPI_COMM_WORLD
>>> MUMPS_PAR%SYM = 1
>>> MUMPS_PAR%PAR = 1
>>> MUMPS_PAR%JOB = -1 ! INITIALIZE MUMPS
>>> CALL MPI_INIT(IERR)
>>> CALL DMUMPS(MUMPS_PAR)
>>> c
>>> CALL MPI_COMM_RANK( MPI_COMM_WORLD, MYID, IERR )
>>> IF ( MYID .EQ. 0 ) CALL SLEEP(180) ! WAIT 180 SEC.
>>> c
>>> MUMPS_PAR%JOB = -2 ! FINALIZE MUMPS
>>> CALL DMUMPS(MUMPS_PAR)
>>> CALL MPI_FINALIZE(IERR)
>>> c
>>> END
>>> (This does nothing but call the initialize & finalize
>>> routines of MUMPS & MPI)
>>> 
>>> command line : mpirun -host node03 -np 16 ./testrun
>>> 
>>> (See attached file: openmpi17rc1-cmp.bmp)