Lakee,

Glad to hear everything worked well with MVAPICH. Regarding OpenMPI, I am 
confused: you are running on a single machine with eight CPUs, correct? Why 
supply a host file then? Try it without the -hostfile <filename> option. 
Typically, I launch my runs as


mpirun -np 4 siesta < input.fdf > output.out &

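If you do end up needing a hostfile later (on a real cluster, say), note that for a single 8-way node an OpenMPI-style hostfile is just one line -- the node name below is only a guess at yours:

```
# hypothetical OpenMPI hostfile for one 8-core machine (node name assumed)
node1 slots=8
```

On a single machine, though, mpirun can place all the ranks locally without any hostfile at all.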

If this still doesn't work, it may have to do with how you configured OpenMPI. 
The machines I run SIESTA on are x86_64 architecture with SuSE 10.x, and 
OpenMPI was compiled with the Intel Fortran and C compilers. I can send the 
log file, if necessary.
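One other thing worth checking (just a guess on my part): an "invalid communicator" failure right at startup often means the siesta binary was linked against one MPI (say, MPICH2) but launched with a different MPI's mpirun. Something along these lines, assuming a Linux box and that the siesta binary sits in the current directory:

```shell
# Which mpirun is first on the PATH, and what does it report itself as?
# If siesta was built with MPICH2 but this mpirun belongs to OpenMPI (or
# vice versa), startup errors like "invalid communicator" are a common result.
MPIRUN=$(command -v mpirun || true)
if [ -n "$MPIRUN" ]; then
    echo "mpirun on PATH: $MPIRUN"
    "$MPIRUN" --version 2>&1 | head -n 1
else
    echo "no mpirun on PATH"
fi

# On Linux, ldd shows which MPI libraries the binary was linked against
# (the ./siesta path is just an example)
if [ -x ./siesta ]; then
    ldd ./siesta | grep -i mpi
fi
```

If ldd reports MPICH libraries while mpirun comes from an OpenMPI install (or the other way around), rebuilding SIESTA against the MPI you actually launch with usually clears this up.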


-Tom



Date: Fri, 22 Aug 2008 22:47:25 +0800
From: [EMAIL PROTECTED]
Subject: Re: [SIESTA-L] Strange mpi problem
To: SIESTA-L@listserv.uam.es

Hello Tom,

Thank you so much for your experience and suggestion. I also have tried MVAPICH 
and OpenMPI today. 

For me, MVAPICH also works very well! :-)

However, OpenMPI does not work with more than one processor. Every time I run 
the command:


 mpirun --mca btl tcp,self -np 4 --hostfile hostfile siesta < input.fdf > output

I always get the following error message:

[node1:10222] *** An error occurred in MPI_Comm_group
[node1:10222] *** on communicator MPI_COMM_WORLD
[node1:10222] *** MPI_ERR_COMM: invalid communicator
[node1:10222] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10223] *** An error occurred in MPI_Comm_group
[node1:10223] *** on communicator MPI_COMM_WORLD
[node1:10223] *** MPI_ERR_COMM: invalid communicator
[node1:10223] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10224] *** An error occurred in MPI_Comm_group
[node1:10224] *** on communicator MPI_COMM_WORLD
[node1:10224] *** MPI_ERR_COMM: invalid communicator
[node1:10224] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10225] *** An error occurred in MPI_Comm_group
[node1:10225] *** on communicator MPI_COMM_WORLD
[node1:10225] *** MPI_ERR_COMM: invalid communicator
[node1:10225] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1198
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job. Returned
value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------

[1]+  Exit 1                  mpirun --mca btl tcp,self -np 4 --hostfile hostfile siesta <input.fdf >output

But when I check the output file, I see the following lines at the end:


...
* Maximum dynamic memory allocated =     1 MB

siesta:                 ==============================
                            Begin CG move =      0
                        ==============================


outcell: Unit cell vectors (Ang):
       12.000000    0.000000    0.000000
        0.000000   12.000000    0.000000
        0.000000    0.000000   41.507400

outcell: Cell vector modules (Ang)   :   12.000000   12.000000   41.507400

outcell: Cell angles (23,13,12) (deg):     90.0000     90.0000     90.0000
outcell: Cell volume (Ang**3)        :   5977.0656

InitMesh: MESH =   108 x   108 x   360 =     4199040
InitMesh: Mesh cutoff (required, used) =   200.000   207.901 Ry


* Maximum dynamic memory allocated =   163 MB

So, the calculation started normally, but then it stopped suddenly. I ran it 
on a single machine with 8 cores. Do you have any idea what might be going 
on, please?

Sincerely,

Lakee


2008/8/20 Thomas Sadowski <[EMAIL PROTECTED]>

Lakee,


I can't speak to the sleeping issue, but I have encountered similar problems 
trying to run parallel SIESTA using MPICH2 as my MPI interface. I would 
recommend trying one of the other MPI implementations to see whether that 
solves the problem. I myself use both OpenMPI and MVAPICH. Depending on how 
aggressively the libraries are compiled, the latter tends to run a little 
faster than the former, but the difference is not really significant. I would 
be interested to hear what other users have to say about this issue.



Tom Sadowski
University of Connecticut



Date: Wed, 20 Aug 2008 22:06:20 +0800
From: [EMAIL PROTECTED]
Subject: [SIESTA-L] Strange mpi problem

To: SIESTA-L@listserv.uam.es

Hello all,

These days I have been trying to run a calculation with the parallel version 
of SIESTA on a PC cluster. It was compiled with MPICH2 and the PGI compiler. 
What surprises me is that sometimes it runs normally, and sometimes the task 
enters a sleeping state right after I submit the job with "mpiexec -n 8 
siesta < input > output". In the output of the "top" command, I see "S" as 
the status on the line for my job, and from that point it never progresses. 
This happens very frequently, even for the same input file. I do not know 
why. Could you tell me how to avoid the job entering a sleeping state right 
after submission, please?



Thank you very much!!

Sincerely,
Lakee

