The "ssh cn308 ldd $WIENROOT/lapw0_mpi" command finds the files from your ifort installation, such as /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_scalapack_lp64.so, just fine. So the environment variables for ifort and MKL seem to be set up and working on both nodes. However, /opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12 exists on the node where you are logged in as renwei (ln3, judging by the prompt) but not on the cn308 node. It looks to me like Intel MPI (impi) is not installed on the cn308 node.
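To make the comparison between nodes less error-prone, the ldd output can be filtered down to just the unresolved libraries. The following is only a minimal sketch: the function name missing_libs is made up for illustration, and it assumes ldd prints unresolved entries in the "name => not found" form seen in your output.

```shell
# List the shared libraries that ldd reports as "not found".
# Reads ldd output on stdin and prints one library name per line.
# (missing_libs is a hypothetical helper name, not part of WIEN2k.)
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Possible usage from the login node (node name taken from this thread):
#   ssh cn308 ldd "$WIENROOT/lapw0_mpi" | missing_libs
```

If this prints nothing on one node and libmpifort.so.12 and libmpi.so.12 on the other, the difference is in the Intel MPI installation rather than in WIEN2k itself.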

Perhaps the cn308 node uses a different partition or a different shared drive. I have read that there are several possible solutions for the Slurm cluster problem you seem to have, and which one applies depends on how the cluster is configured [ https://lists.schedmd.com/pipermail/slurm-users/2017-December/000272.html ]. You might be able to check which partitions the ln3 and cn308 nodes belong to with sinfo [ https://slurm.schedmd.com/sinfo.html ].
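As a rough sketch of that sinfo check (I have not run this on your cluster, and the function name node_partitions is only for illustration), something like the following should print the partitions a given node belongs to. It uses the sinfo options -h (no header), -N (node-oriented output), -n (restrict to the named nodes) and -o with the %P field (partition name).

```shell
# Print the partition(s) that a node belongs to, one per line.
# (node_partitions is a hypothetical helper name; sinfo must be on PATH.)
# A trailing "*" on a partition name marks the cluster's default partition.
node_partitions() {
    sinfo -h -N -n "$1" -o '%P' | sort -u
}

# Possible usage (node name taken from this thread):
#   node_partitions cn308
```

A login node such as ln3 may not be part of any partition at all, in which case the command prints nothing for it.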

Maybe you just need to have your cluster manager (administrator, help desk, ...) install impi on cn308, as you did for ifort. To remove the "manpath: command not found" message, the cluster manager probably just has to install the man or man-db package on the cn308 node (they should be able to check the documentation or forums for the OS that the cluster uses on how to install manpath; typically, for example, yum install man or apt-get install man-db). I have never administered a Slurm cluster, so for additional help with your problem you may have to ask a Slurm expert (e.g., your cluster manager or the Slurm mailing list [ https://slurm.schedmd.com/mail.html ]).
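Once the cluster manager has made changes, a quick check on cn308 can confirm whether the two pieces are in place. This is only a sketch under the assumption that impi would end up at the same path as on the other node; the helper names check_files and check_cmd are made up for illustration.

```shell
# Print "found: <path>" or "missing: <path>" for each argument.
check_files() {
    for f in "$@"; do
        if [ -e "$f" ]; then
            echo "found: $f"
        else
            echo "missing: $f"
        fi
    done
}

# Print whether a command can be found by the current shell.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

# Possible usage after "ssh cn308" (paths are the ones ldd printed on the other node):
#   check_files /opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12 \
#               /opt/intel/impi/5.0.2.044/intel64/lib/libmpi.so.12
#   check_cmd manpath
```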

On 6/16/2018 4:28 AM, venkatesh chandragiri wrote:

Dear Prof. Marks,

I did "ssh othernode ldd $WIENROOT/lapw0_mpi".

=========

[renwei@ln3 ~]$  ssh cn308 ldd $WIENROOT/lapw0_mpi
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 118: manpath: command not found
        linux-vdso.so.1 =>  (0x00007fffd8fff000)
        libfftw3_mpi.so.3 => /THFS/home/renwei/venky/soft/fftw/lib/libfftw3_mpi.so.3 (0x00007fd41621d000)
        libmkl_scalapack_lp64.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_scalapack_lp64.so (0x00007fd415947000)
        libmkl_blacs_intelmpi_lp64.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so (0x00007fd41570a000)
        libfftw3.so.3 => /THFS/home/renwei/venky/soft/fftw/lib/libfftw3.so.3 (0x00007fd4153fe000)
        libmkl_intel_lp64.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_lp64.so (0x00007fd414cb0000)
        libmkl_intel_thread.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_thread.so (0x00007fd413c90000)
        libmkl_core.so => /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_core.so (0x00007fd41259c000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fd412380000)
        libmpifort.so.12 => not found
        libmpi.so.12 => not found
        libdl.so.2 => /lib64/libdl.so.2 (0x00007fd412172000)
        librt.so.1 => /lib64/librt.so.1 (0x00007fd411f69000)
        libm.so.6 => /lib64/libm.so.6 (0x00007fd411ce5000)
        libiomp5.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libiomp5.so (0x00007fd4119ca000)
        libc.so.6 => /lib64/libc.so.6 (0x00007fd411628000)
        libgcc_s.so.1 => /THFS/home/sh-hzw2/software/Matlab2014a//sys/os/glnxa64/libgcc_s.so.1 (0x00007fd411413000)
        libimf.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libimf.so (0x00007fd410f50000)
        libsvml.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libsvml.so (0x00007fd410354000)
        libirng.so => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libirng.so (0x00007fd41014d000)
        libintlc.so.5 => /opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libintlc.so.5 (0x00007fd40fef7000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fd416436000)

=========

As shown here, libmpifort.so.12 and libmpi.so.12 are "not found" when I run ldd on the cn308 node.

But these have well-defined paths when I run ldd at "renwei":

        libmpifort.so.12 => /opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12 (0x00002b3a37c98000)
        libmpi.so.12 => /opt/intel/impi/5.0.2.044/intel64/lib/libmpi.so.12 (0x00002b3a37f21000)

===================

[renwei@ln3 ~]$ ssh cn308 $WIENROOT/lapw0_mpi
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 118: manpath: command not found
/THFS/home/renwei/venky/soft/wien2k/lapw0_mpi: error while loading shared libraries: libmpifort.so.12: cannot open shared object file: No such file or directory
[renwei@ln3 ~]$


===============

[renwei@ln3 ~]$ ssh cn308
Last login: Sat Jun 16 17:59:04 2018 from ln3-gn0
-bash: manpath: command not found
[renwei@cn308 ~]$ $WIENROOT/lapw0_mpi
/THFS/home/renwei/venky/soft/wien2k/lapw0_mpi: error while loading shared libraries: libmpifort.so.12: cannot open shared object file: No such file or directory

========================

You also mentioned "use static compilation". I don't understand this. Do you mean static compilation of WIEN2k? How can I do it? (I am sorry to ask this; I come from an experimental background and have not come across these kinds of issues.)


thank you.

venkatesh
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
