Hi Tom, I tried it. The problem is still there.
Lakee

2008/8/24 Thomas Sadowski <[EMAIL PROTECTED]>:

> That might be your problem... Try linking against libmpi_f90.a.
>
> -Tom
>
> ------------------------------
> Date: Sat, 23 Aug 2008 10:13:43 +0800
> From: [EMAIL PROTECTED]
> Subject: Re: [SIESTA-L] Strange mpi problem
> To: SIESTA-L@listserv.uam.es
>
> Hi, Dear Tom,
>
> I tried "mpirun -np 4 siesta < input.fdf > output.out &". What I get is
> the following:
>
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --------------------------------------------------------------------------
> [0,1,0]: OpenIB on host node1 was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> libibverbs: Fatal: couldn't read uverbs ABI version.
> --------------------------------------------------------------------------
> [0,1,1]: OpenIB on host node1 was unable to find any HCAs.
> Another transport will be used instead, although this may result in
> lower performance.
> --------------------------------------------------------------------------
> [node1:11756] *** An error occurred in MPI_Comm_group
> [node1:11757] *** An error occurred in MPI_Comm_group
> [node1:11757] *** on communicator MPI_COMM_WORLD
> [node1:11757] *** MPI_ERR_COMM: invalid communicator
> [node1:11757] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:11756] *** on communicator MPI_COMM_WORLD
> [node1:11756] *** MPI_ERR_COMM: invalid communicator
> [node1:11756] *** MPI_ERRORS_ARE_FATAL (goodbye)
>
> From the FAQ on the official Open MPI site, it seems that we should use
> the option "--mca btl tcp,self" for a TCP network?
>
> My machines are also x86-64 architecture, with Red Hat Enterprise Linux 4.
> The compiler is pgi-7.1.4. My configure options are:
>
> ./configure --prefix=/usr/local/math_library/pgi-7.1.4/openmpi-1.2.6 \
>   CC=cc CXX=c++ F77=pgf90 F90=pgf90 FC=pgf90 CFLAGS="-O2" FFLAGS="-O2"
>
> Do you think there is anything wrong here?
>
> By the way, when I compile BLACS and ScaLAPACK, I cannot see the file
> "libmpich.a" in the "lib" directory of openmpi-1.2.6, but I can see it
> in both mpich2 and mvapich2. So I set MPILIB in Bmake.inc of BLACS and
> SMPLIB in SLmake.inc of ScaLAPACK to "openmpi-1.2.6/lib/libmpi.la",
> since only this file looks similar to libmpich.a. Is that wrong?
>
> I would appreciate it very much if you could send your log file for
> configuring Open MPI, along with your Bmake.inc and SLmake.inc.
>
> Sincerely,
> Lakee
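For reference: Open MPI ships no libmpich.a, and libmpi.la is only libtool
metadata, not a linkable archive; in Open MPI 1.2.x the Fortran bindings
live in libmpi_f90 and libmpi_f77 on top of libmpi. A minimal sketch of the
Bmake.inc/SLmake.inc settings that Tom's suggestion implies, assuming that
standard library layout (the prefix is the one from the configure line
above; treat the exact flags as a starting point, not a definitive recipe):

    # Ask the Open MPI wrapper for the exact link line it would use:
    #   mpif77 --showme:link
    MPIdir = /usr/local/math_library/pgi-7.1.4/openmpi-1.2.6
    # MPILIB goes in BLACS's Bmake.inc; SMPLIB in ScaLAPACK's SLmake.inc:
    MPILIB = -L$(MPIdir)/lib -lmpi_f90 -lmpi_f77 -lmpi
    SMPLIB = $(MPILIB)

Simpler still is to set the Fortran compiler variables in those makefiles
to mpif90 and let the wrapper supply the MPI libraries itself.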
> 2008/8/23 Thomas Sadowski <[EMAIL PROTECTED]>:
>
> Lakee,
>
> Glad to hear everything worked well with MVAPICH. In regards to Open MPI,
> I am confused. You are running on a machine with eight CPUs, correct?
> Why supply a host file then? Try it without the -hostfile <filename>.
> Typically, my jobs are launched as
>
> mpirun -np 4 siesta < input.fdf > output.out &
>
> If this still doesn't work, it may have to do with how you configured
> Open MPI. The machines I run SIESTA on are x86_64 architecture with
> SuSE 10.x. Open MPI was compiled with the Intel Fortran and C compilers.
> I can send the log file if necessary.
>
> -Tom
>
> ------------------------------
> Date: Fri, 22 Aug 2008 22:47:25 +0800
> From: [EMAIL PROTECTED]
> Subject: Re: [SIESTA-L] Strange mpi problem
> To: SIESTA-L@listserv.uam.es
>
> Hello, Dear Tom,
>
> Thank you so much for your experience and suggestions. I also tried
> MVAPICH and Open MPI today.
>
> For me, MVAPICH also works very well! :-)
>
> However, Open MPI does not work for more than one processor. Every time
> after I run the command
>
> mpirun --mca btl tcp,self -np 4 --hostfile hostfile siesta <input.fdf >output
>
> I get the following error message:
>
> [node1:10222] *** An error occurred in MPI_Comm_group
> [node1:10222] *** on communicator MPI_COMM_WORLD
> [node1:10222] *** MPI_ERR_COMM: invalid communicator
> [node1:10222] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10223] *** An error occurred in MPI_Comm_group
> [node1:10223] *** on communicator MPI_COMM_WORLD
> [node1:10223] *** MPI_ERR_COMM: invalid communicator
> [node1:10223] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10224] *** An error occurred in MPI_Comm_group
> [node1:10224] *** on communicator MPI_COMM_WORLD
> [node1:10224] *** MPI_ERR_COMM: invalid communicator
> [node1:10224] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10225] *** An error occurred in MPI_Comm_group
> [node1:10225] *** on communicator MPI_COMM_WORLD
> [node1:10225] *** MPI_ERR_COMM: invalid communicator
> [node1:10225] *** MPI_ERRORS_ARE_FATAL (goodbye)
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file
> base/pls_base_orted_cmds.c at line 275
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c
> at line 1166
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at
> line 90
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file
> base/pls_base_orted_cmds.c at line 188
> [node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c
> at line 1198
> --------------------------------------------------------------------------
> mpirun was unable to cleanly terminate the daemons for this job.
> Returned value Timeout instead of ORTE_SUCCESS.
> --------------------------------------------------------------------------
>
> [1]+  Exit 1    mpirun --mca btl tcp,self -np 4 --hostfile hostfile
>       siesta <input.fdf >output
>
> But when I check the output file, I see the following lines at the end:
>
> ...
> * Maximum dynamic memory allocated =     1 MB
>
> siesta: ==============================
>         Begin CG move =      0
>         ==============================
>
> outcell: Unit cell vectors (Ang):
>    12.000000    0.000000    0.000000
>     0.000000   12.000000    0.000000
>     0.000000    0.000000   41.507400
>
> outcell: Cell vector modules (Ang)   :   12.000000   12.000000   41.507400
> outcell: Cell angles (23,13,12) (deg):    90.0000    90.0000    90.0000
> outcell: Cell volume (Ang**3)        : 5977.0656
>
> InitMesh: MESH = 108 x 108 x 360 = 4199040
> InitMesh: Mesh cutoff (required, used) = 200.000 207.901 Ry
>
> * Maximum dynamic memory allocated =   163 MB
>
> So the calculation started normally, but then it stopped suddenly. I did
> this on a single machine with 8 cores. Do you have any idea about it?
>
> Sincerely,
> Lakee
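An MPI_ERR_COMM abort inside MPI_Comm_group this early in a run is the
classic signature of a mixed-MPI build: BLACS/ScaLAPACK (or SIESTA itself)
compiled against one implementation's mpif.h, for example MPICH2's, but
launched with another implementation's mpirun. Communicator handles are not
interchangeable between implementations. A quick diagnostic sketch (the
binary name and paths are illustrative):

    # Which MPI does the executable actually pull in at run time?
    ldd ./siesta | grep -i mpi
    # (A statically linked libmpich.a will not appear here at all;
    # in this context that absence is itself a warning sign.)

    # Do the launcher and the compile-time wrapper match?
    which mpirun
    mpif90 --showme:link

It is also worth checking the TRANSCOMM setting in BLACS's Bmake.inc: the
Open MPI FAQ recommends TRANSCOMM = -DUseMpi2 when building BLACS against
Open MPI, and an MPICH-specific value there produces exactly this kind of
invalid-communicator abort.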
> 2008/8/20 Thomas Sadowski <[EMAIL PROTECTED]>:
>
> Lakee,
>
> I cannot speak to the serial sleep issue, but I have encountered similar
> problems trying to run parallel SIESTA using MPICH2 as my MPI interface.
> I would recommend trying one of the other MPI implementations and seeing
> whether that solves the problem. Myself, I use both Open MPI and MVAPICH.
> Depending on how aggressively the libraries are compiled, the latter
> tends to run a little faster than the former, but the difference is not
> really significant. I would be interested to hear what other users have
> to say about this issue.
>
> Tom Sadowski
> University of Connecticut
>
> ------------------------------
> Date: Wed, 20 Aug 2008 22:06:20 +0800
> From: [EMAIL PROTECTED]
> Subject: [SIESTA-L] Strange mpi problem
> To: SIESTA-L@listserv.uam.es
>
> Hello, Dear all,
>
> These days I have been trying to run a calculation with the parallel
> version of SIESTA on a PC cluster. It was compiled with mpich2 and the
> PGI compiler. What surprises me is that sometimes it runs normally, and
> sometimes the task enters a sleeping state right after I submit the job
> with "mpiexec -n 8 siesta < input > output". In the output of the "top"
> command I can see "S" as the status on the line for my job, and from
> that point it never progresses. This happens very frequently, even with
> the same input file, and I do not know why. Could you tell me how to
> avoid the job entering a sleeping state right after submission, please?
>
> Thank you very much!
>
> Sincerely,
> Lakee
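On the original MPICH2 hang: an "S" state in top only means interruptible
sleep, which is normal for a process waiting on I/O, so the real question
is what the ranks are waiting for. MPICH2 of that era launched jobs through
the MPD daemon ring, and a stale or missing ring is a common cause of jobs
that wedge immediately after submission. A hedged sketch, assuming the MPD
process manager is in use:

    # Is an MPD ring up and reachable? This lists the daemons it can see:
    mpdtrace -l

    # If not, clear any stale daemons and restart a ring on the local box:
    mpdallexit
    mpdboot -n 1

    # Then resubmit with explicit redirection:
    mpiexec -n 8 siesta < input.fdf > output.out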