Hi, Tom,

I tried it. The problem is still there.

Lakee



2008/8/24 Thomas Sadowski <[EMAIL PROTECTED]>

> That might be your problem...Try linking against libmpi_f90.a
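>
> A rough sketch of what I mean, reusing the install prefix from your
> configure line (MPIdir is just a convenience variable I made up here;
> adjust it to wherever OpenMPI actually lives on your system):
>
>    # define MPIdir (or spell out the full path) in each of the two files
>    MPIdir = /usr/local/math_library/pgi-7.1.4/openmpi-1.2.6
>
>    # BLACS Bmake.inc
>    MPILIB = $(MPIdir)/lib/libmpi_f90.a $(MPIdir)/lib/libmpi.a
>
>    # ScaLAPACK SLmake.inc
>    SMPLIB = $(MPIdir)/lib/libmpi_f90.a $(MPIdir)/lib/libmpi.a
>
> As far as I recall, those .a archives are only installed if OpenMPI was
> configured with --enable-static; with a shared-only build you would link
> with -L$(MPIdir)/lib -lmpi_f90 -lmpi instead.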
>
>
> -Tom
>
>
>
> ------------------------------
> Date: Sat, 23 Aug 2008 10:13:43 +0800
>
> From: [EMAIL PROTECTED]
> Subject: Re: [SIESTA-L] Strange mpi problem
> To: SIESTA-L@listserv.uam.es
>
> Hi Tom,
>
> I tried "mpirun -np 4 siesta < input.fdf > output.out &". What I get is the
> following:
>
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --------------------------------------------------------------------------
> [0,1,0]: OpenIB on host node1 was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --------------------------------------------------------------------------
> [0,1,1]: OpenIB on host node1 was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> [node1:11756] *** An error occurred in MPI_Comm_group
> [node1:11757] *** An error occurred in MPI_Comm_group
> [node1:11757] *** on communicator MPI_COMM_WORLD
> [node1:11757] *** MPI_ERR_COMM: invalid communicator
> [node1:11757] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:11756] *** on communicator MPI_COMM_WORLD
> [node1:11756] *** MPI_ERR_COMM: invalid communicator
> [node1:11756] *** MPI_ERRORS_ARE_FATAL (goodbye)
>
> From the FAQ on the official Open MPI site, it seems that we should use the
> option "--mca btl tcp,self" for a TCP network. Is that right?
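>
> That is, combining it with your suggestion to drop the host file, I suppose
> the command would look something like this (please correct me if I have
> misunderstood the FAQ):
>
>    mpirun --mca btl tcp,self -np 4 siesta < input.fdf > output.out &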
>
> My machines are also x86-64 architecture with RedHat Enterprise Linux 4.
> The compiler is pgi-7.1.4.  My configure options are:
>
> ./configure --prefix=/usr/local/math_library/pgi-7.1.4/openmpi-1.2.6 CC=cc
> CXX=c++ F77=pgf90 F90=pgf90 FC=pgf90 CFLAGS="-O2" FFLAGS="-O2"
>
> Do you think there is anything wrong here?
>
> By the way, when I compile BLACS and ScaLAPACK, I cannot find the file
> "libmpich.a" in the "lib" directory of openmpi-1.2.6, although I can see it
> in both mpich2 and mvapich2. So I set MPILIB in BLACS' Bmake.inc and SMPLIB
> in ScaLAPACK's SLmake.inc to "openmpi-1.2.6/lib/libmpi.la", since only this
> file looks similar to libmpich.a. Is this wrong?
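>
> For reference, the relevant lines currently read roughly as follows (the
> paths are under my install prefix):
>
>    # Bmake.inc (BLACS)
>    MPILIB = /usr/local/math_library/pgi-7.1.4/openmpi-1.2.6/lib/libmpi.la
>
>    # SLmake.inc (ScaLAPACK)
>    SMPLIB = /usr/local/math_library/pgi-7.1.4/openmpi-1.2.6/lib/libmpi.la
>
> I also wonder whether I should reconfigure OpenMPI with --enable-static, so
> that real static archives (libmpi.a, libmpi_f90.a) are installed, instead of
> pointing the makefiles at the libtool .la file.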
>
> I would appreciate it very much if you could send your log file from
> configuring OpenMPI, together with your Bmake.inc and SLmake.inc.
>
> Sincerely,
> Lakee
>
> 2008/8/23 Thomas Sadowski <[EMAIL PROTECTED]>
>
> Lakee,
>
>
> Glad to hear everything worked well with MVAPICH. In regard to OpenMPI, I
> am confused: you are running on a machine with eight CPUs, correct? Why
> supply a host file then? Try it without -hostfile <filename>. Typically, I
> launch my runs as follows:
>
>
> mpirun -np 4 siesta < input.fdf > output.out &
>
>
> If this still doesn't work, it may have to do with how you configured
> OpenMPI. The machines I run SIESTA on are x86_64 architecture with SuSE
> 10.x, and OpenMPI was compiled with the Intel Fortran and C compilers. I
> can send the log file if necessary.
>
>
> -Tom
>
>
>
> ------------------------------
> Date: Fri, 22 Aug 2008 22:47:25 +0800
> From: [EMAIL PROTECTED]
> Subject: Re: [SIESTA-L] Strange mpi problem
>
> To: SIESTA-L@listserv.uam.es
>
> Hello Tom,
>
> Thank you so much for sharing your experience and suggestions. I have also
> tried MVAPICH and OpenMPI today.
>
> For me, MVAPICH also works very well! :-)
>
> However, OpenMPI does not work with more than one processor. Every time I
> run the command:
>
>  mpirun --mca btl tcp,self -np 4  --hostfile hostfile siesta <input.fdf
> >output
>
> I always get the following error message:
>
> [node1:10222] *** An error occurred in MPI_Comm_group
> [node1:10222] *** on communicator MPI_COMM_WORLD
> [node1:10222] *** MPI_ERR_COMM: invalid communicator
> [node1:10222] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10223] *** An error occurred in MPI_Comm_group
> [node1:10223] *** on communicator MPI_COMM_WORLD
> [node1:10223] *** MPI_ERR_COMM: invalid communicator
> [node1:10223] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10224] *** An error occurred in MPI_Comm_group
> [node1:10224] *** on communicator MPI_COMM_WORLD
> [node1:10224] *** MPI_ERR_COMM: invalid communicator
> [node1:10224] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10225] *** An error occurred in MPI_Comm_group
> [node1:10225] *** on communicator MPI_COMM_WORLD
> [node1:10225] *** MPI_ERR_COMM: invalid communicator
> [node1:10225] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file
> base/pls_base_orted_cmds.c at line 275
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at
> line 1166
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line
> 90
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file
> base/pls_base_orted_cmds.c at line 188
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at
> line 1198
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons for this job. Returned
> value Timeout instead of ORTE_SUCCESS.
> --------------------------------------------------------------------------
>
> [1]+  Exit 1                  mpirun --mca btl tcp,self -np 4 --hostfile
> hostfile siesta <input.fdf >output
>
> But when I check the output file, I see the following lines at the end:
>
> ...
> * Maximum dynamic memory allocated =     1 MB
>
> siesta:                 ==============================
>                             Begin CG move =      0
>                         ==============================
>
> outcell: Unit cell vectors (Ang):
>        12.000000    0.000000    0.000000
>         0.000000   12.000000    0.000000
>         0.000000    0.000000   41.507400
>
> outcell: Cell vector modules (Ang)   :   12.000000   12.000000   41.507400
> outcell: Cell angles (23,13,12) (deg):     90.0000     90.0000     90.0000
> outcell: Cell volume (Ang**3)        :   5977.0656
>
> InitMesh: MESH =   108 x   108 x   360 =     4199040
> InitMesh: Mesh cutoff (required, used) =   200.000   207.901 Ry
>
> * Maximum dynamic memory allocated =   163 MB
>
> So the calculation started normally, but then it stopped suddenly. I ran it
> on a single machine with 8 cores. Do you have any idea what might be wrong?
>
> Sincerely,
> Lakee
>
>
> 2008/8/20 Thomas Sadowski <[EMAIL PROTECTED]>
>
> Lakee,
>
>
> I cannot speak to the sleeping issue you describe, but I have encountered
> similar problems trying to run parallel SIESTA using MPICH2 as my MPI
> interface. I would recommend trying one of the other MPI implementations to
> see whether that solves the problem. I myself use both OpenMPI and MVAPICH.
> Depending on how aggressively the libraries are compiled, the latter tends
> to run a little faster than the former, but the difference is not really
> significant. I would be interested to hear what other users have to say
> about this issue.
>
>
> Tom Sadowski
> University of Connecticut
>
>
>
> ------------------------------
> Date: Wed, 20 Aug 2008 22:06:20 +0800
> From: [EMAIL PROTECTED]
> Subject: [SIESTA-L] Strange mpi problem
> To: SIESTA-L@listserv.uam.es
>
>
> Hello all,
>
> These days I have been trying to run a calculation with the parallel
> version of SIESTA on a PC cluster. It was compiled with mpich2 and the PGI
> compiler. What surprises me is that sometimes it runs normally, and
> sometimes the task enters a sleeping state right after I submit the job
> with "mpiexec -n 8 siesta < input > output". In the output of the command
> "top", I can see "S" as the status on the line of my job, and at that point
> it never proceeds. This happens very frequently, even for the same input
> file, and I do not know why. Could you tell me how to avoid the job
> entering a sleeping state right after submission?
>
> Thank you very much!!
>
> Sincerely,
> Lakee
>