Hi Lakee,
I think you have a problem with your OpenMPI setup: it tries to use InfiniBand, but it cannot find any InfiniBand cards (HCAs), so it falls back to standard Ethernet.

What I recommend is:
1) Check the tests and examples that ship with OpenMPI (a quick sketch of steps 1 and 2 is below this list).
2) Compile and test BLACS. If the tests fail, SIESTA will fail too, so do not proceed with the SIESTA installation until BLACS works.
3) Do the same for ScaLAPACK.
4) Then go for the SIESTA compilation.
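
For steps 1 and 2, a minimal sanity check could look like the following. This is just a sketch on my side: I am assuming your OpenMPI 1.2.6 wrappers are first in your PATH, that you still have the examples/ directory from the OpenMPI tarball, and that the BLACS tester executable ends up with a name like xFbtest_MPI-LINUX-0 (the exact name depends on your Bmake.inc):

ompi_info | grep -i fortran          # were the Fortran 77/90 bindings built?
ompi_info | grep btl                 # which transports (tcp, sm, openib, ...) are available?
cd openmpi-1.2.6/examples
mpif90 -o hello_f90 hello_f90.f90    # Fortran 90 example shipped with OpenMPI
mpirun -np 4 ./hello_f90             # should print one line per process

cd BLACS/TESTING/EXE                 # after building BLACS and its tester ("make mpi", "make tester")
mpirun -np 4 ./xFbtest_MPI-LINUX-0   # Fortran BLACS tester; the name may differ on your setup

If any of these fails, fix that before touching ScaLAPACK or SIESTA.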

Regards,
Eduardo


On 24/08/2008, at 5:43, Lakee Johnson wrote:

Hi, Tom,

I tried it. The problem is still there.

Lakee



2008/8/24 Thomas Sadowski <[EMAIL PROTECTED]>
That might be your problem... Try linking against libmpi_f90.a instead of the .la file.
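For example (only a sketch on my part, reusing the install prefix from your configure command; whether you also need -lmpi_f77 depends on how BLACS/ScaLAPACK call MPI on your system):

MPILIB = -L/usr/local/math_library/pgi-7.1.4/openmpi-1.2.6/lib -lmpi_f90 -lmpi
SMPLIB = -L/usr/local/math_library/pgi-7.1.4/openmpi-1.2.6/lib -lmpi_f90 -lmpi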


-Tom



Date: Sat, 23 Aug 2008 10:13:43 +0800
From: [EMAIL PROTECTED]
Subject: Re: [SIESTA-L] Strange mpi problem
To: SIESTA-L@listserv.uam.es

Hi Tom,

I tried "mpirun -np 4 siesta < input.fdf > output.out &". What I get is the following:

libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,1,0]: OpenIB on host node1 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
libibverbs: Fatal: couldn't read uverbs ABI version.
--------------------------------------------------------------------------
[0,1,1]: OpenIB on host node1 was unable to find any HCAs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
[node1:11756] *** An error occurred in MPI_Comm_group
[node1:11757] *** An error occurred in MPI_Comm_group
[node1:11757] *** on communicator MPI_COMM_WORLD
[node1:11757] *** MPI_ERR_COMM: invalid communicator
[node1:11757] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:11756] *** on communicator MPI_COMM_WORLD
[node1:11756] *** MPI_ERR_COMM: invalid communicator
[node1:11756] *** MPI_ERRORS_ARE_FATAL (goodbye)

From the FAQ on the official OpenMPI site, it seems that we should use the option "--mca btl tcp,self" for a TCP network. Is that right?
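That is, something like this (just my guess from the FAQ; I read that "sm" is the shared-memory transport and that "^openib" excludes the InfiniBand component):

mpirun --mca btl tcp,sm,self -np 4 siesta < input.fdf > output.out
# or, simply excluding InfiniBand:
mpirun --mca btl ^openib -np 4 siesta < input.fdf > output.out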

My machines are also x86_64 architecture with Red Hat Enterprise Linux 4. The compiler is PGI 7.1.4. My configure options are:

./configure --prefix=/usr/local/math_library/pgi-7.1.4/openmpi-1.2.6 CC=cc CXX=c++ F77=pgf90 F90=pgf90 FC=pgf90 CFLAGS="-O2" FFLAGS="-O2"

Do you think there is anything wrong here?

By the way, when I compile BLACS and ScaLAPACK, I cannot find the file "libmpich.a" in the "lib" directory of openmpi-1.2.6, although I can see it in both mpich2 and mvapich2. So I set MPILIB in Bmake.inc of BLACS and SMPLIB in SLmake.inc of ScaLAPACK to "openmpi-1.2.6/lib/libmpi.la", since only this file looks similar to libmpich.a. Is that wrong?
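Or should I instead point these variables directly at the OpenMPI libraries? The following is only my guess (the prefix is the one from my configure command above, and I am not sure whether libmpi_f77 or libmpi_f90 is the right Fortran binding library to use):

# BLACS Bmake.inc (my guess)
MPIdir    = /usr/local/math_library/pgi-7.1.4/openmpi-1.2.6
MPILIBdir = $(MPIdir)/lib
MPIINCdir = $(MPIdir)/include
MPILIB    = -L$(MPILIBdir) -lmpi_f77 -lmpi

# ScaLAPACK SLmake.inc (my guess)
SMPLIB    = -L/usr/local/math_library/pgi-7.1.4/openmpi-1.2.6/lib -lmpi_f77 -lmpi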

I would appreciate it very much if you could send your log file from configuring OpenMPI, along with your Bmake.inc and SLmake.inc.

Sincerely,
Lakee

2008/8/23 Thomas Sadowski <[EMAIL PROTECTED]>
Lakee,


Glad to hear everything worked well with MVAPICH. Regarding OpenMPI, I am confused: you are running on a machine with eight CPUs, correct? Why supply a host file then? Try it without the -hostfile <filename>. Typically, my runs are launched as


mpirun -np 4 siesta < input.fdf > output.out &


If this still doesn't work, it may have to do with how you configured OpenMPI. The machines I run SIESTA on are x86_64 architecture with SuSE 10.x, and OpenMPI was compiled with the Intel Fortran and C compilers. I can send the log file if necessary.


-Tom



Date: Fri, 22 Aug 2008 22:47:25 +0800
From: [EMAIL PROTECTED]
Subject: Re: [SIESTA-L] Strange mpi problem
To: SIESTA-L@listserv.uam.es

Hello Tom,

Thank you so much for sharing your experience and suggestions. I have also tried MVAPICH and OpenMPI today.

For me, MVAPICH also works very well! :-)

However, OpenMPI does not work with more than one processor. Every time I run the command:

mpirun --mca btl tcp,self -np 4 --hostfile hostfile siesta <input.fdf >output

I always get the following error message:

[node1:10222] *** An error occurred in MPI_Comm_group
[node1:10222] *** on communicator MPI_COMM_WORLD
[node1:10222] *** MPI_ERR_COMM: invalid communicator
[node1:10222] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10223] *** An error occurred in MPI_Comm_group
[node1:10223] *** on communicator MPI_COMM_WORLD
[node1:10223] *** MPI_ERR_COMM: invalid communicator
[node1:10223] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10224] *** An error occurred in MPI_Comm_group
[node1:10224] *** on communicator MPI_COMM_WORLD
[node1:10224] *** MPI_ERR_COMM: invalid communicator
[node1:10224] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10225] *** An error occurred in MPI_Comm_group
[node1:10225] *** on communicator MPI_COMM_WORLD
[node1:10225] *** MPI_ERR_COMM: invalid communicator
[node1:10225] *** MPI_ERRORS_ARE_FATAL (goodbye)
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1166
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file errmgr_hnp.c at line 90
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 188
[node1:10219] [0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1198
--------------------------------------------------------------------------
mpirun was unable to cleanly terminate the daemons for this job. Returned value Timeout instead of ORTE_SUCCESS.
--------------------------------------------------------------------------

[1]+ Exit 1 mpirun --mca btl tcp,self -np 4 --hostfile hostfile siesta <input.fdf >output

But when I check the output file, I see the following lines at the end:

...
* Maximum dynamic memory allocated =     1 MB

siesta:                 ==============================
                            Begin CG move =      0
                        ==============================

outcell: Unit cell vectors (Ang):
       12.000000    0.000000    0.000000
        0.000000   12.000000    0.000000
        0.000000    0.000000   41.507400

outcell: Cell vector modules (Ang)   :   12.000000   12.000000   41.507400
outcell: Cell angles (23,13,12) (deg):     90.0000     90.0000     90.0000
outcell: Cell volume (Ang**3)        :   5977.0656

InitMesh: MESH =   108 x   108 x   360 =     4199040
InitMesh: Mesh cutoff (required, used) =   200.000   207.901 Ry

* Maximum dynamic memory allocated =   163 MB

So the calculation started normally, but then it stopped suddenly. I ran it on a single machine with 8 cores. Do you have any idea what might be going wrong?

Sincerely,
Lakee


2008/8/20 Thomas Sadowski <[EMAIL PROTECTED]>
Lakee,


I cannot speak to the sleep issue itself, but I have encountered similar problems trying to run parallel SIESTA using MPICH2 as my MPI interface. I would recommend trying one of the other MPI implementations to see whether that solves the problem. I myself use both OpenMPI and MVAPICH. Depending on how aggressively the libraries are compiled, the latter tends to run a little faster than the former, but the difference is not really significant. I would be interested to hear what other users have to say about this issue.


Tom Sadowski
University of Connecticut



Date: Wed, 20 Aug 2008 22:06:20 +0800
From: [EMAIL PROTECTED]
Subject: [SIESTA-L] Strange mpi problem
To: SIESTA-L@listserv.uam.es


Hello all,

These days I have been trying to run a calculation with the parallel version of SIESTA on a PC cluster; it was compiled with mpich2 and the PGI compiler. What surprises me is that sometimes it runs normally, and sometimes the task enters a sleeping state right after I submit the job with "mpiexec -n 8 siesta <input> output". In the output of the command "top", I see "S" as the status on the line of my job, and from that point it never continues. This happens very frequently, even with the same input file, and I do not know why. Could you tell me how to keep the job from entering a sleeping state right after submission?
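Is there also a standard way to see what the sleeping processes are actually waiting on? The only thing I can think of is something like the following (the process name and the <PID> are just placeholders taken from what "top" shows):

ps -o pid,stat,wchan:20,cmd -C siesta   # kernel function each rank is waiting in
strace -p <PID>                          # is the rank blocked in a read()/poll()?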

Thank you very much!!

Sincerely,
Lakee
