Re: [Wien] error in running .machines file

2018-06-16 Thread Gavin Abo
The "ssh cn308 ldd $WIENROOT/lapw0_mpi" output shows that the libraries 
from your ifort installation, such as 
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_scalapack_lp64.so, 
are found just fine.  So your environment variables seem to be set up and 
working on both nodes.  However, 
/opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12 exists on the 
renwei node but not on the cn308 node.  It looks to me like Intel MPI 
(impi) is not installed on the cn308 node.
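
To pick out only the unresolved libraries from a long ldd listing, something like the following sketch may help. The function name missing_libs is made up for illustration (it is not part of WIEN2k or ldd), and it assumes the usual "name => not found" lines that ldd prints:

```shell
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output on stdin, so it works for a local or a remote ldd run.
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Possible usage with the node and binary from this thread:
#   ssh cn308 ldd "$WIENROOT/lapw0_mpi" 2>/dev/null | missing_libs
```

Running the same pipeline against each compute node should show quickly whether the gap is limited to cn308 or affects all of them.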


Perhaps the cn308 node is using a different partition or different 
shared drive.  I have read that there are different possible solutions 
for the slurm cluster problem you seem to have which depend on how it is 
configured [ 
https://lists.schedmd.com/pipermail/slurm-users/2017-December/000272.html 
].  You might be able to check which partition the renwei node and cn308 
node are using with sinfo [ https://slurm.schedmd.com/sinfo.html ].
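
As a rough sketch of the sinfo check, the wrapper below prints the partition(s) a given node belongs to. The function name node_partitions is hypothetical; the sinfo options used (-N, -h, -n, -o with %N and %P) are standard ones, but the output depends on how your cluster is configured, and a login node may not be listed at all:

```shell
# Print "<node> <partition>" for the given node name.
# -N: one line per node, -h: no header line,
# -n: restrict to the named node, -o "%N %P": node name and partition name.
node_partitions() {
    sinfo -N -h -n "$1" -o "%N %P"
}

# Possible usage with the compute node from this thread:
#   node_partitions cn308
```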


Maybe you just have to have your cluster manager (administrator, help 
desk, ...) install impi like what you did for ifort.  To remove the 
"manpath: command not found", the cluster manager probably just has to 
install the man or man-db package on the cn308 node (they should be able 
to check the documentation or forums for the OS that their cluster is 
using on how to install manpath, typically for example: yum install man 
or apt-get install man-db).  I have never performed administration 
functions of a slurm cluster, so for additional help with your problem 
you may have to ask a slurm expert (e.g., your cluster manager or the 
slurm mailing list [ https://slurm.schedmd.com/mail.html ]).


On 6/16/2018 4:28 AM, venkatesh chandragiri wrote:


Dear Prof. Marks,

I did "ssh othernode ldd $WIENROOT/lapw0_mpi".

=

[renwei@ln3 ~]$  ssh cn308 ldd $WIENROOT/lapw0_mpi
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 
118: manpath: command not found

    linux-vdso.so.1 =>  (0x7fffd8fff000)
    libfftw3_mpi.so.3 => 
/THFS/home/renwei/venky/soft/fftw/lib/libfftw3_mpi.so.3 
(0x7fd41621d000)
    libmkl_scalapack_lp64.so => 
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_scalapack_lp64.so 
(0x7fd415947000)
    libmkl_blacs_intelmpi_lp64.so => 
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so 
(0x7fd41570a000)
    libfftw3.so.3 => 
/THFS/home/renwei/venky/soft/fftw/lib/libfftw3.so.3 (0x7fd4153fe000)
    libmkl_intel_lp64.so => 
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_lp64.so 
(0x7fd414cb)
    libmkl_intel_thread.so => 
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_thread.so 
(0x7fd413c9)
    libmkl_core.so => 
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_core.so 
(0x7fd41259c000)

    libpthread.so.0 => /lib64/libpthread.so.0 (0x7fd41238)
*_    libmpifort.so.12 => not found
    libmpi.so.12 => not found_*
    libdl.so.2 => /lib64/libdl.so.2 (0x7fd412172000)
    librt.so.1 => /lib64/librt.so.1 (0x7fd411f69000)
    libm.so.6 => /lib64/libm.so.6 (0x7fd411ce5000)
    libiomp5.so => 
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libiomp5.so 
(0x7fd4119ca000)

    libc.so.6 => /lib64/libc.so.6 (0x7fd411628000)
    libgcc_s.so.1 => 
/THFS/home/sh-hzw2/software/Matlab2014a//sys/os/glnxa64/libgcc_s.so.1 
(0x7fd411413000)
    libimf.so => 
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libimf.so 
(0x7fd410f5)
    libsvml.so => 
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libsvml.so 
(0x7fd410354000)
    libirng.so => 
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libirng.so 
(0x7fd41014d000)
    libintlc.so.5 => 
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libintlc.so.5 
(0x7fd40fef7000)

    /lib64/ld-linux-x86-64.so.2 (0x7fd416436000)

=

As shown here, *_libmpifort.so.12 => not found and 
libmpi.so.12 => not found when I run it on the cn308 node_*


But these have well-defined paths when I run ldd on "renwei":

    libmpifort.so.12 => 
/opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12 
 (0x2b3a37c98000)
    libmpi.so.12 => 
/opt/intel/impi/5.0.2.044/intel64/lib/libmpi.so.12 
 (0x2b3a37f21000)


===

[renwei@ln3 ~]$ ssh cn308 $WIENROOT/lapw0_mpi
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 
118: manpath: command not found
/THFS/home/renwei/venky/soft/wien2k/lapw0_mpi: error while loading 
shared libraries: libmpifort.so.12: cannot open shared object file: No 
such file or directory

[renwei@ln3 ~]$


===

[renwei@ln3 ~]$ ssh cn308
Last login: Sat Jun 16 17:59:04 2018 from ln3-gn0
-bash: manpath: command not found
[renwei@cn308 ~]$ 

Re: [Wien] error in running .machines file

2018-06-16 Thread venkatesh chandragiri
Dear Prof. Marks,

I did "ssh othernode ldd $WIENROOT/lapw0_mpi".

=

[renwei@ln3 ~]$  ssh cn308 ldd $WIENROOT/lapw0_mpi
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 118:
manpath: command not found
linux-vdso.so.1 =>  (0x7fffd8fff000)
libfftw3_mpi.so.3 =>
/THFS/home/renwei/venky/soft/fftw/lib/libfftw3_mpi.so.3 (0x7fd41621d000)
libmkl_scalapack_lp64.so =>
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_scalapack_lp64.so
(0x7fd415947000)
libmkl_blacs_intelmpi_lp64.so =>
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so
(0x7fd41570a000)
libfftw3.so.3 =>
/THFS/home/renwei/venky/soft/fftw/lib/libfftw3.so.3 (0x7fd4153fe000)
libmkl_intel_lp64.so =>
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_lp64.so
(0x7fd414cb)
libmkl_intel_thread.so =>
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_intel_thread.so
(0x7fd413c9)
libmkl_core.so =>
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64/libmkl_core.so
(0x7fd41259c000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x7fd41238)

*libmpifort.so.12 => not found
libmpi.so.12 => not found*
libdl.so.2 => /lib64/libdl.so.2 (0x7fd412172000)
librt.so.1 => /lib64/librt.so.1 (0x7fd411f69000)
libm.so.6 => /lib64/libm.so.6 (0x7fd411ce5000)
libiomp5.so =>
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libiomp5.so
(0x7fd4119ca000)
libc.so.6 => /lib64/libc.so.6 (0x7fd411628000)
libgcc_s.so.1 =>
/THFS/home/sh-hzw2/software/Matlab2014a//sys/os/glnxa64/libgcc_s.so.1
(0x7fd411413000)
libimf.so =>
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libimf.so
(0x7fd410f5)
libsvml.so =>
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libsvml.so
(0x7fd410354000)
libirng.so =>
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libirng.so
(0x7fd41014d000)
libintlc.so.5 =>
/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64/libintlc.so.5
(0x7fd40fef7000)
/lib64/ld-linux-x86-64.so.2 (0x7fd416436000)

=

As shown here, *libmpifort.so.12 => not found and
libmpi.so.12 => not found when I run it on the cn308 node*

But these have well-defined paths when I run ldd on "renwei":

libmpifort.so.12 => /opt/intel/impi/5.0.2.044/intel64/lib/libmpifort.so.12
(0x2b3a37c98000)
libmpi.so.12 => /opt/intel/impi/5.0.2.044/intel64/lib/libmpi.so.12
(0x2b3a37f21000)

===

[renwei@ln3 ~]$ ssh cn308 $WIENROOT/lapw0_mpi
/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh: line 118:
manpath: command not found
/THFS/home/renwei/venky/soft/wien2k/lapw0_mpi: error while loading shared
libraries: libmpifort.so.12: cannot open shared object file: No such file
or directory
[renwei@ln3 ~]$


===

[renwei@ln3 ~]$ ssh cn308
Last login: Sat Jun 16 17:59:04 2018 from ln3-gn0
-bash: manpath: command not found
[renwei@cn308 ~]$ $WIENROOT/lapw0_mpi
/THFS/home/renwei/venky/soft/wien2k/lapw0_mpi: error while loading shared
libraries: libmpifort.so.12: cannot open shared object file: No such file
or directory


**

You also mentioned "use static compilation". I don't understand
this. Do you mean static compilation of WIEN2k? How can I do it? (I
am sorry to ask this; I come from an experimental background and have not
come across these kinds of issues.)


thank you.

venkatesh
___
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


Re: [Wien] error in running .machines file

2018-06-16 Thread Peter Blaha

cd $WIENROOT
edit parallel_options and set USE_REMOTE and MPI_REMOTE to zero.

Then there is no ssh anymore. (But you can use only one node for k-parallel)
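
For anyone who prefers to script this change, here is a minimal sketch. It assumes parallel_options contains csh-style lines of the form "setenv USE_REMOTE 1" (check your own file first, since the layout can differ between WIEN2k versions), and the function name disable_remote is made up for this example:

```shell
# Set USE_REMOTE and MPI_REMOTE to 0 in the given parallel_options file.
# Assumes csh-style "setenv NAME <number>" lines; other lines are untouched.
# The original file is kept next to it with a .bak suffix.
disable_remote() {
    sed -i.bak -E \
        's/(setenv[[:space:]]+(USE_REMOTE|MPI_REMOTE))[[:space:]]+[0-9]+/\1 0/' \
        "$1"
}

# Possible usage:
#   disable_remote "$WIENROOT/parallel_options"
```

Afterwards a "grep REMOTE $WIENROOT/parallel_options" should show both variables set to 0.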

Regards

On 16.06.2018 at 12:02, venkatesh chandragiri wrote:

Dear Prof. Gavin,

I am using a slurm-based environment for running the jobs. I have attached 
the typical script I use to submit the job. Although I kept the export & 
source of LD_LIBRARY_PATH and the path to compilervars.sh in it, I also 
source them again from a separate "myenev" file.


===
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --job-name=MnSb2
#SBATCH --output=out%j.txt
#SBATCH --uid=renwei
#SBATCH --partition=sz-renwei
export OMP_NUM_THREADS=1

export PATH="/THFS/home/renwei/softwares/anaconda2/bin:$PATH"
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/home/renwei/venky/soft/libxc/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/home/renwei/venky/soft/fftw/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/impi/5.0.2.044/intel64/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/lib64
export WIENROOT=/THFS/home/renwei/venky/soft/wien2k
source /THFS/opt/intel/composer_xe_2013_sp1.3.174/bin/compilervars.sh intel64
source /THFS/opt/intel/composer_xe_2013_sp1.3.174/bin/ifortvars.sh intel64
source /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh intel64
source /THFS/opt/intel/impi/5.0.2.044/intel64/bin/mpivars.sh intel64


source myenev

===
script to generate .machines file
=

wien2k=`runsp_lapw -NI -i 200 -ec 0.1 -cc 0.0001 -p`

srun $wien2k

===


The calculations run under the user account "renwei", which a group of 
students share by creating separate folders in it. WIEN2k is installed in 
my local folder "venky/soft/wien2k" and the calculations are run from 
"venky/wien2k_sim/MnSb".


The renwei account already contains a .ssh folder. This folder has 
both "id_rsa.pub" and "authorized_keys" files. The content of the 
id_rsa.pub file has already been copied into the authorized_keys file.


After following your suggestion in the earlier mail, the permissions of 
authorized_keys look like this:


-rw-r-  authorized_keys
-rw-r--r--  id_rsa.pub

I ran ssh to one of the nodes, and it does not prompt me for a password, 
as shown below.


[renwei@ln3 ~]$ ssh cn308
Last login: Sat Jun 16 01:20:03 2018 from ln3-gn0
-bash: manpath: command not found
[renwei@cn308 ~]$


Now, after doing all of this, the error still persists.


venkatesh






--
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300 FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW: http://www.imc.tuwien.ac.at/tc_blaha- 




Re: [Wien] error in running .machines file

2018-06-16 Thread venkatesh chandragiri
Dear Prof. Gavin,

I am using a slurm-based environment for running the jobs. I have attached
the typical script I use to submit the job. Although I kept the export &
source of LD_LIBRARY_PATH and the path to compilervars.sh in it, I also
source them again from a separate "myenev" file.

===
#!/bin/bash -l
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=24
#SBATCH --job-name=MnSb2
#SBATCH --output=out%j.txt
#SBATCH --uid=renwei
#SBATCH --partition=sz-renwei
export OMP_NUM_THREADS=1

export PATH="/THFS/home/renwei/softwares/anaconda2/bin:$PATH"
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/opt/intel/composer_xe_2013_sp1.3.174/compiler/lib/intel64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/lib/intel64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/home/renwei/venky/soft/libxc/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/THFS/home/renwei/venky/soft/fftw/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/intel/impi/5.0.2.044/intel64/lib
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/lib64
export WIENROOT=/THFS/home/renwei/venky/soft/wien2k
source /THFS/opt/intel/composer_xe_2013_sp1.3.174/bin/compilervars.sh intel64
source /THFS/opt/intel/composer_xe_2013_sp1.3.174/bin/ifortvars.sh intel64
source /THFS/opt/intel/composer_xe_2013_sp1.3.174/mkl/bin/mklvars.sh intel64
source /THFS/opt/intel/impi/5.0.2.044/intel64/bin/mpivars.sh intel64

source myenev

===
script to generate .machines file
=

wien2k=`runsp_lapw -NI -i 200 -ec 0.1 -cc 0.0001 -p`

srun $wien2k

===
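
Since the job script above only appends directories to LD_LIBRARY_PATH without checking them, a small helper like the sketch below could be run on the compute node to see which of those directories actually exist there. The function name missing_ld_dirs is hypothetical and only meant as an illustration:

```shell
# Print every entry of a colon-separated path list that is not an
# existing directory on the host where this runs.
# The body runs in a subshell so the IFS change does not leak out.
missing_ld_dirs() (
    IFS=':'
    for dir in $1; do
        if [ -n "$dir" ] && [ ! -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
)

# Possible usage inside the job script, after the exports:
#   missing_ld_dirs "$LD_LIBRARY_PATH"
```

Note that the script adds /opt/intel/impi/5.0.2.044/intel64/lib to LD_LIBRARY_PATH but sources mpivars.sh from /THFS/opt/intel/impi/5.0.2.044, so it may be worth checking whether both locations exist on cn308.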


The calculations run under the user account "renwei", which a group of
students share by creating separate folders in it. WIEN2k is installed in
my local folder "venky/soft/wien2k" and the calculations are run from
"venky/wien2k_sim/MnSb".

The renwei account already contains a .ssh folder. This folder has both
"id_rsa.pub" and "authorized_keys" files. The content of the id_rsa.pub
file has already been copied into the authorized_keys file.

After following your suggestion in the earlier mail, the permissions of
authorized_keys look like this:

-rw-r-  authorized_keys
-rw-r--r--  id_rsa.pub

I ran ssh to one of the nodes, and it does not prompt me for a password,
as shown below.

[renwei@ln3 ~]$ ssh cn308
Last login: Sat Jun 16 01:20:03 2018 from ln3-gn0
-bash: manpath: command not found
[renwei@cn308 ~]$


Now, after doing all of this, the error still persists.


venkatesh