Re: [Wien] Parallel calculation
If you are using gfortran and gcc, some of the keystrokes I captured when I installed WIEN2k 21.1 are shown below, in case they help. These are just the steps I follow as a guide to get a working configuration of WIEN2k up and running quickly on my system. After that, I usually go back into ./siteconfig and adjust the settings further so that calculations complete faster, using the compiler documentation to find better settings for my system. For example, the gfortran documentation is at [1]; other compilers, such as the Intel Fortran compiler [2], have their own documentation. The mpi parallel settings configured below don't always work, depending on the configuration I need on my computer systems. When that happens, the WIEN2k usersguide [3] and the FAQ page [4] help with choosing the correct siteconfig settings, or past posts on parallel calculations in the mailing list archive contain a solution to the issue I encountered. I haven't tried the SRC_mpiutil package on the unsupported page [5] with WIEN2k 21.1, but I found it helpful with a past WIEN2k version. Below I used Open MPI [6], but you could use another mpi implementation [7].

[1] https://gcc.gnu.org/wiki/GFortran
[2] https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-oneapi-dev-guide-and-reference/top.html
[3] http://www.wien2k.at/reg_user/textbooks/usersguide.pdf
[4] http://www.wien2k.at/reg_user/faq/pbs.html
[5] http://www.wien2k.at/reg_user/unsupported/
[6] https://www.open-mpi.org/
[7] https://en.wikipedia.org/wiki/MPICH

*Installed Ubuntu LTS*

https://help.ubuntu.com/community/Installation
https://ubuntu.com/#download

username@computername:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.2 LTS
Release: 20.04
Codename: focal

*Installed XCrySDen*

username@computername:~$ cd ~
username@computername:~$ sudo apt update
username@computername:~$ sudo apt install tcsh ghostscript octave gnuplot gnuplot-x11 make autoconf libtool perl libquadmath0 gfortran build-essential libglu1-mesa-dev libtogl-dev tcl-dev tk-dev libfftw3-dev libxmu-dev
username@computername:~$ wget http://www.xcrysden.org/download/xcrysden-1.6.2.tar.gz
username@computername:~$ tar xvf xcrysden-1.6.2.tar.gz
username@computername:~$ cd xcrysden-1.6.2
username@computername:~/xcrysden-1.6.2$ cp ./system/Make.sys-shared Make.sys
username@computername:~/xcrysden-1.6.2$ make all
username@computername:~/xcrysden-1.6.2$ echo 'export XCRYSDEN_TOPDIR=/home/username/xcrysden-1.6.2'>>~/.bashrc
username@computername:~/xcrysden-1.6.2$ echo 'export PATH=$PATH:$XCRYSDEN_TOPDIR'>>~/.bashrc
username@computername:~/xcrysden-1.6.2$ source ~/.bashrc

*Installed libxc*

username@computername:~/xcrysden-1.6.2$ cd ~
username@computername:~$ wget http://www.tddft.org/programs/libxc/down.php?file=5.1.4/libxc-5.1.4.tar.gz
username@computername:~$ tar xvf down.php\?file\=5.1.4%2Flibxc-5.1.4.tar.gz
username@computername:~$ cd libxc-5.1.4/
username@computername:~/libxc-5.1.4$ autoreconf -i --force
username@computername:~/libxc-5.1.4$ ./configure FC=gfortran CC=gcc --prefix=$HOME/libxc-5.1.4
username@computername:~/libxc-5.1.4$ make
username@computername:~/libxc-5.1.4$ make check
username@computername:~/libxc-5.1.4$ make install

*Installed OpenBLAS*

username@computername:~/libxc-5.1.4$ cd ~
username@computername:~$ wget https://github.com/xianyi/OpenBLAS/releases/download/v0.3.15/OpenBLAS-0.3.15.tar.gz
username@computername:~$ tar zxvf OpenBLAS-0.3.15.tar.gz
username@computername:~$ cd OpenBLAS-0.3.15/
username@computername:~/OpenBLAS-0.3.15$ make FC=gfortran CC=gcc
username@computername:~/OpenBLAS-0.3.15$ echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/username/OpenBLAS-0.3.15'>>~/.bashrc
username@computername:~/OpenBLAS-0.3.15$ source ~/.bashrc

*Installed Open MPI*

username@computername:~/OpenBLAS-0.3.15$ cd ~
username@computername:~$ wget https://download.open-mpi.org/release/open-mpi/v4.1/openmpi-4.1.1.tar.gz
username@computername:~$ tar xvf openmpi-4.1.1.tar.gz
username@computername:~$ cd openmpi-4.1.1/
username@computername:~/openmpi-4.1.1$ ./configure --prefix=$HOME/openmpi-4.1.1
username@computername:~/openmpi-4.1.1$ make all install
username@computername:~/openmpi-4.1.1$ echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/username/openmpi-4.1.1/lib'>>~/.bashrc
username@computername:~/openmpi-4.1.1$ echo 'export PATH=$PATH:/home/username/openmpi-4.1.1/bin'>>~/.bashrc
username@computername:~/openmpi-4.1.1$ source ~/.bashrc

*Installed fftw*

username@computername:~/openmpi-4.1.1$ cd ~
username@computername:~$ wget http://www.fftw.org/fftw-3.3.9.tar.gz
username@computername:~$ tar xvf fftw-3.3.9.tar.gz
username@computername:~$ cd fftw-3.3.9/
username@computername:~/fftw-3.3.9$ ./configure FCC=gfortran CC=gcc MPICC=mpicc --enable-mpi --prefix=$HOME/fftw-3.3.9
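The captured log stops partway through the FFTW configure step. The usual remaining steps would be something like the following (a sketch under the same paths, not part of the captured log):

username@computername:~/fftw-3.3.9$ make
username@computername:~/fftw-3.3.9$ make install

After the libraries are in place, WIEN2k itself is unpacked and configured with ./siteconfig, where the compiler and the OpenBLAS, libxc, FFTW and Open MPI paths set up above are entered; the usersguide [3] describes that step in detail.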
Re: [Wien] Parallel calculation
Let's take this step by step. There are three parallel modes: k-points, open-mp and mpi. For calculations with roughly 20 atoms or fewer you only need k-points and open-mp. Open-mp is enabled as part of your compile options. Please read the user guide. For mpi you need a fast interconnect and an mpi installation (Open MPI, Intel MPI or other); it is then configured during the compile. What do you have, and what are you trying to do? Your question was far too open.
_
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Györgyi
www.numis.northwestern.edu

On Tue, Jun 8, 2021, 07:21 ben amara imen wrote:
> Can someone tell me how I can install the parallel calculation for Wien2k?
> Thanks in advance
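For the k-point mode, a minimal .machines sketch for four k-point parallel jobs on a single 4-core machine could look like this (the hostname and the number of lines are just an example; the granularity/extrafine lines follow the .machines files shown later in this digest):

1:localhost
1:localhost
1:localhost
1:localhost
granularity:1
extrafine:1

Open-mp threading is controlled separately, e.g. through the OMP_NUM_THREADS environment variable discussed further down in this digest.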
[Wien] Parallel calculation
Dear,
Can someone tell me how I can install the parallel calculation for Wien2k?
Thanks in advance.

Best regards
Re: [Wien] Parallel calculation in more than 2 nodes
parallel_options is sourced by all the mpi scripts, so you don't want to edit lapw1para or any of the others. It should now work for everything.
_
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Györgyi
www.numis.northwestern.edu

On Wed, Jul 22, 2020, 03:48 MA Weiliang wrote:
> [...]
Re: [Wien] Parallel calculation in more than 2 nodes
Hi,

I changed parallel_options to ' setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_" '. It works.

Because lapw1para_lapw contains the lines below, I had earlier just tried commenting out the MPIRUN line in parallel_options:

if ( $?WIEN_MPIRUN ) then
  set mpirun = "$WIEN_MPIRUN"
else
  set mpirun='mpirun -np _NP_ _EXEC_'
endif

Thank you again.

Best regards,
Weiliang MA
Re: [Wien] Parallel calculation in more than 2 nodes
> parallel option file: # setenv WIEN_MPIRUN "srun -K1 _EXEC_"
> Because of compatibility issues, we don't use srun; we commented out the
> WIEN_MPIRUN line in the parallel option file and use mpirun directly.

You cannot just comment out the MPIRUN variable. If you don't want to use srun, you should set it to:

setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"

(or, during siteconfig, use the ifort option without srun)

Regards

On 7/21/20 12:48 PM, MA Weiliang wrote:
> [...]
Re: [Wien] Parallel calculation in more than 2 nodes
parallel_options:

setenv TASKSET "no"
if ( ! $?USE_REMOTE ) setenv USE_REMOTE 0
if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
setenv WIEN_GRANULARITY 1
setenv DELAY 0.1
setenv SLEEPY 1
#setenv WIEN_MPIRUN "srun -K1 _EXEC_"
if ( ! $?CORES_PER_NODE) setenv CORES_PER_NODE 16
#if ( ! $?PINNING_COMMAND) setenv PINNING_COMMAND "--cpu_bind=map_cpu:"
#if ( ! $?PINNING_LIST ) setenv PINNING_LIST "0,8,1,9,2,10,3,11,4,12,5,13,6,14,7,15"

mpi: mpirun from the Intel 2017 compilers.
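For comparison, a parallel_options using the mpirun launcher that was reported to work earlier in this thread might look roughly like this (a sketch assuming the same Intel mpirun and 16-core nodes, not a verbatim copy of anyone's final file):

setenv TASKSET "no"
if ( ! $?USE_REMOTE ) setenv USE_REMOTE 0
if ( ! $?MPI_REMOTE ) setenv MPI_REMOTE 0
setenv WIEN_GRANULARITY 1
setenv DELAY 0.1
setenv SLEEPY 1
setenv WIEN_MPIRUN "mpirun -np _NP_ -machinefile _HOSTS_ _EXEC_"
if ( ! $?CORES_PER_NODE) setenv CORES_PER_NODE 16

With _HOSTS_ present, the parallel scripts hand the generated machine file to mpirun, which is what lets the 32 processes spread over both nodes instead of piling up on the first one.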
Re: [Wien] Parallel calculation in more than 2 nodes
i.e. paste the result of "cat $WIENROOT/parallel_options" into an email.
_
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu

On Tue, Jul 21, 2020, 07:44 Laurence Marks wrote:
> [...]
Re: [Wien] Parallel calculation in more than 2 nodes
What are you using in parallel_options? The statement:

"parallel option file: # setenv WIEN_MPIRUN "srun -K1 _EXEC_"
Because of compatible issues, we don't use srun by commented the WIEN_MPIRUN line in parallel option file and use the mpirun directly."

is ambiguous. What mpi?
_
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu

On Tue, Jul 21, 2020, 05:48 MA Weiliang wrote:
> [...]
[Wien] Parallel calculation in more than 2 nodes
Dear WIEN2K users,

The cluster we use is a shared-memory system with 16 cpus per node. The calculation was distributed over 2 nodes with 32 cpus, but according to the attached top output all the mpi processes were actually running on the first node; there were no processes on the second node. As you can see, the cpu usage is around 50%. It seems that the calculation was not distributed over 2 nodes, but only split the first node (16 cpus) into 32 processes with half the computing power each.

Do you have any ideas about this problem? The .machines file, wien2k info, dayfile and job output are attached below. Thank you!

Best,
Weiliang

##
# output of top
##
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
43504 mc  20  0 614m 262m 27m R 50.2 0.3 21:45.54 lapw1c_mpi
43507 mc  20  0 611m 259m 26m R 50.2 0.3 21:50.76 lapw1c_mpi
43514 mc  20  0 614m 255m 22m R 50.2 0.3 21:51.37 lapw1c_mpi
... 32 lines in total ...
43508 mc  20  0 615m 260m 23m R 49.5 0.3 21:43.73 lapw1c_mpi
43513 mc  20  0 616m 257m 22m R 49.5 0.3 21:51.32 lapw1c_mpi
43565 mc  20  0 562m 265m 24m R 49.5 0.3 21:43.29 lapw1c_mpi

##
# .machines file
##
1:lame26:16
1:lame28:16
lapw0: lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28
dstart: lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28
nlvdw: lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28
lapw2_vector_split:2
granularity:1
extrafine:1

##
# wien2k info
##
wien2k version: 18.2
compiler: ifort, icc, mpiifort (intel 2017 compilers)
parallel option file: # setenv WIEN_MPIRUN "srun -K1 _EXEC_"
Because of compatibility issues, we don't use srun; we commented out the WIEN_MPIRUN line in the parallel option file and use mpirun directly.
##
# dayfile
##
cycle 7 (Mon Jul 20 20:56:01 CEST 2020) (194/93 to go)
>   lapw0 -p    (20:56:01) starting parallel lapw0 at Mon Jul 20 20:56:01 CEST 2020
.machine0 : 32 processors
0.087u 0.176s 0:17.87 1.3% 0+0k 0+112io 0pf+0w
>   lapw1 -p -c (20:56:19) starting parallel lapw1 at Mon Jul 20 20:56:19 CEST 2020
->  starting parallel LAPW1 jobs at Mon Jul 20 20:56:20 CEST 2020
running LAPW1 in parallel mode (using .machines)
2 number_of_parallel_jobs
lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26 lame26(16) 0.022u 0.049s 56:37.88 0.0% 0+0k 0+8io 0pf+0w
lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28 lame28(16) 0.031u 0.038s 56:00.24 0.0% 0+0k 0+8io 0pf+0w
Summary of lapw1para:
lame26 k=0 user=0 wallclock=0
lame28 k=0 user=0 wallclock=0
18.849u 18.501s 56:40.85 1.0% 0+0k 0+1032io 0pf+0w
>   lapwso -p -c (21:53:00) running LAPWSO in parallel mode
lame26 0.026u 0.044s 2:20:06.55 0.0% 0+0k 0+8io 0pf+0w
lame28 0.027u 0.043s 2:18:40.89 0.0% 0+0k 0+8io 0pf+0w
Summary of lapwsopara:
lame26 user=0.026 wallclock=140
lame28 user=0.027 wallclock=138
0.235u 2.621s 2:20:13.57 0.0% 0+0k 0+864io 0pf+0w
>   lapw2 -p -c -so (00:13:14) running LAPW2 in parallel mode
lame26 0.023u 0.044s 4:58.20 0.0% 0+0k 0+8io 0pf+0w
lame28 0.024u 0.044s 5:02.58 0.0% 0+0k 0+8io 0pf+0w
Summary of lapw2para:
lame26 user=0.023 wallclock=298.2
lame28 user=0.024 wallclock=302.58
5.836u 1.057s 5:11.94 2.2% 0+0k 0+166184io 0pf+0w
>   lcore (00:18:26) 1.576u 0.042s 0:02.06 78.1% 0+0k 0+12888io 0pf+0w
>   mixer (00:18:30) 6.472u 0.687s 0:07.97 89.7% 0+0k 0+308832io 0pf+0w
:ENERGY convergence: 0 0.05 .000121525000
:CHARGE convergence: 0 0.5 .0002538
ec cc and fc_conv 0 0 1

##
# job output
##
in cycle 3 ETEST: .52305136 CTEST: .0049036
LAPW0 END
[1] Done mpirun -np 32 /home/mcs/work/wma/Package/wien2k.18m/lapw0_mpi lapw0.def >> .time00
LAPW1 END
[1] - Done
Re: [Wien] Parallel calculation -stop of iterations after mixer. :ENE NaN value
A NaN means that something has gone wrong. While this may show up with the mixer, it probably occurs somewhere else.

When I look at your struct file, the Mg and O atoms at the interface have low BVS, indicating that there will be a significant contraction along your z axis. You picked RMTs for the initial structure with the spheres almost touching or even touching -- they are smaller than setrmt recommends. I strongly suspect that your calculation is failing because you have touching spheres.

The simplest thing to do is to reduce the RMTs, for instance use "setrmt case -r 5". With the previous converged calculation you might be able to use "reduce_rmt_lapw -r 5 -sp". However, since your initial RMTs were too large, this might not work. Make sure that you do not do this with densities where you have a NaN.

This may not resolve the problem, but I am 99% certain you won't get anywhere with what you sent using your current RMTs.

On Sun, Oct 28, 2018 at 8:36 AM Coriolan TIUSAN <coriolan.tiu...@phys.utcluj.ro> wrote:
> [...]
[Wien] Parallel calculation -stop of iterations after mixer. :ENE NaN value
Dear wien2k users,

I am running wien 18.2 on Ubuntu 18.04, installed on an HP station: 64GB, Intel® Xeon(R) Gold 5118 CPU @ 2.30GHz × 48. The fortran compiler/math library are ifc and the intel mkl library. For parallel execution I have MPI+SCALAPACK and FFTW.

When calculating multilayered structures (e.g. V(3ML)/Fe(3ML)/MgO(3ML) in a supercell slab model, attached structure file), after a certain number of iterations towards convergence I get a sudden stop, without any error (in the error files), after the mixer.

I have similar problems for similar heterostructures (Au/Fe/MgO), but only when trying to do geometrical optimization (force minimization). The initial scf calculation for the Au/Fe/MgO slab converged (runsp_lapw -cc 0.001 -ec 0.0001 -p) without any errors. I guess this demonstrates that the programs have been properly installed and are functional?

Coming back to the stop in the scf for V/Fe/MgO, the following things can be remarked:

1/ For the last iteration the case.scf file contains NaN values for :DIS and :ENE,

:DEN : DENSITY INTEGRAL = NaN (Ry)
:ENE : *WARNING** TOTAL ENERGY IN Ry = NaN

2/ The dayfile displays:

LAPW0 END
[1] Done mpirun -np 48 -machinefile .machine0 /home/tiusan/W2k/lapw0_mpi lapw0.def >> .time00
LAPW1 END
[1] + Done ( cd $PWD; $t $ttt; rm -f .lock_$lockfile[$p] ) >> .time1_$loop
LAPW1 END
[1] + Done ( cd $PWD; $t $ttt; rm -f .lock_$lockfile[$p] ) >> .time1_$loop
LAPW2 - FERMI; weights written
LAPW2 END
[1] Done ( cd $PWD; $t $ttt $vector_split; rm -f .lock_$lockfile[$p] ) >> .time2_$loop
SUMPARA END
LAPW2 - FERMI; weights written
LAPW2 END
[1] Done ( cd $PWD; $t $ttt $vector_split; rm -f .lock_$lockfile[$p] ) >> .time2_$loop
SUMPARA END
CORE END
CORE END
MIXER END
(standard_in) 1: syntax error
(standard_in) 1: syntax error
(standard_in) 1: syntax error
(standard_in) 1: syntax error
(standard_in) 1: syntax error
etest: Subscript out of range.

3/ Following some remarks from the forum, I have added sleep 1 in the testconv script, without any success...

Could someone help?

With thanks in advance,

Coriolan TIUSAN
Department of Physics and Chemistry, Technical University of Cluj-Napoca
Center of Superconductivity, Spintronics and Surface Science
e-mail: coriolan.tiu...@phys.utcluj.ro | web: http://www.c4s.utcluj.ro/

Attached structure file (truncated):

VFeMgO
P   LATTICE,NONEQUIV.ATOMS: 13
MODE OF CALC=RELA unit=ang
5.725872 5.725872 33.391474 90.00 90.00 90.00
ATOM  -1: X=0. Y=0. Z=0.
          MULT= 1 ISPLIT=-2
V 1  NPT= 781 R0=0.5000 RMT=2.0800 Z: 23.0
LOCAL ROT MATRIX: 1.000 0.000 0.000
                  0.000 1.000 0.000
                  0.000 0.000 1.000
ATOM  -2: X=0.5000 Y=0.5000 Z=0.08573854
          MULT= 1 ISPLIT=-2
V 2  NPT= 781 R0=0.5000 RMT=2.0800 Z: 23.0
LOCAL ROT MATRIX: 1.000 0.000 0.000
                  0.000 1.000 0.000
                  0.000 0.000 1.000
ATOM  -3: X=0. Y=0. Z=0.17147708
          MULT= 1 ISPLIT=-2
V 3  NPT= 781 R0=0.5000 RMT=2.0800 Z: 23.0
LOCAL ROT MATRIX: 1.000 0.000 0.000
                  0.000 1.000 0.000
                  0.000 0.000
[Wien] parallel calculation with 1 kpoint
Via the mpi-parallel option. You need corresponding hardware (MORE than just 4 cores, e.g. a cluster with infiniband) and the corresponding mpi software. Read the UG for more details.

On 16.01.2012 13:28, ali ghafari wrote:
> How can I do parallel calculations with 1 kpoint?

--
P.Blaha
Peter BLAHA, Inst. f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300  FAX: +43-1-58801-165982
Email: blaha at theochem.tuwien.ac.at  WWW: http://info.tuwien.ac.at/theochem/
--
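To make this concrete: with a single k-point, the .machines file assigns all cores to one mpi job instead of listing one line per k-point. A minimal sketch for two 16-core nodes (the hostnames node1/node2 and the two-node line are assumptions for illustration; the syntax follows the .machines example earlier in this digest):

1:node1:16 node2:16
granularity:1
extrafine:1

lapw1/lapw2 then run as a single 32-process mpi job, provided the mpi/ScaLAPACK versions of the programs were built.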
[Wien] parallel calculation with 1 kpoint
Dear Wien2k users,

How can I do parallel calculations with 1 kpoint?

Best wishes
Ali
[Wien] Parallel calculation with Dual Quad Core
Hi Gerhard,

Thank you very much. I just changed the environment variable OMP_NUM_THREADS to 8 in .bashrc, so all processors are used.

thanks,
Chao
[Wien] Parallel calculation with Dual Quad Core Processors
Dear Prof. Marks,

Thanks very much for your suggestions. Now I know how to set up the parallel calculation and edit .machines.

In addition, I have another question about the performance of my machine with Dual Quad Core Processors (i.e. 8 CPUs). By checking the processes, I found that all eight CPUs of my machine were used at more than 90%, even when I performed the calculation without k-point parallelization. I compared the two cases with and without k-point parallelization and found that they took almost the same time. So I think it is not necessary to perform a k-point parallel calculation in my case, and that this workstation will automatically distribute the work evenly over all eight CPUs. Is my understanding right?

Thanks,
Best regards,
Chao
[Wien] Parallel calculation with Dual Quad Core Processors
Do you use the serial or the parallel version of the MKL? The MKL that is used for matrix operations is internally parallelized; this behaviour is usually controlled by the environment variables OMP_NUM_THREADS or MKL_NUM_THREADS, as well as some other MKL-related ones (check the MKL manual). The use of the parallel MKL may therefore cause all of your cores to be used.

Usually the Wien2k config script sets OMP_NUM_THREADS=1 in the .bashrc. If you cannot figure out what is setting the thread number for the MKL, try to force Wien2k to use the serial version (check the compiler and linker options in the Intel manuals). Indeed, the use of several threads for the MKL together with k-point parallelization may be counterproductive and can even slow down the program.

Note: if you change the number of threads using OMP_NUM_THREADS=1, you may need to restart W2WEB, otherwise it stays with the old value.

Ciao
Gerhard

Dr. Gerhard H. Fecher
Institut of Inorganic and Analytical Chemistry
Johannes Gutenberg - University
55099 Mainz

On Sunday, 10 January 2010 00:32, Chao Ma [cma at blem.ac.cn] wrote:
> [...]
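As a small illustration of the thread settings discussed above, these lines could go into ~/.bashrc or a job script (example values only, not a recommendation for every machine):

# one BLAS/MKL thread per process: the safe choice together with k-point parallelization
export OMP_NUM_THREADS=1
export MKL_NUM_THREADS=1

# alternatively, let a single serial lapw1/lapw2 use all 8 cores through the threaded MKL
# export OMP_NUM_THREADS=8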
[Wien] Parallel calculation with Dual Quad Core Processors
Dear All,

I installed Wien2k_09.2 on a workstation with Dual Quad Core Intel Xeon Processors. For a parallel calculation, can the system automatically allocate the k-points to the two processors, or should I edit .machines? If so, how do I edit .machines?

Any suggestion is greatly appreciated.

Best regards,
Chao Ma
[Wien] Parallel calculation with Dual Quad Core Processors
You have to edit .machines -- please read the UG.

2010/1/4 Chao Ma <cma at blem.ac.cn>:
> I installed Wien2k_09.2 on a workstation with Dual Quad Core Intel Xeon
> Processors. For a parallel calculation, can the system automatically allocate
> the k-points to the two processors, or should I edit .machines?

--
Laurence Marks
Department of Materials Science and Engineering
MSE Rm 2036 Cook Hall, 2220 N Campus Drive
Northwestern University, Evanston, IL 60208, USA
Tel: (847) 491-3996  Fax: (847) 491-7820
email: L-marks at northwestern dot edu
Web: www.numis.northwestern.edu
Chair, Commission on Electron Crystallography of IUCR
[Wien] parallel calculation stops waiting for a pass in ssh root@localhost
Dear Wien users,

I have some problems using parallelization over k-points in the latest release of WIEN2k. It is important to mention that I had no troubles with my settings in the past. In particular I am working with dual-core desktop PCs and one quad-core machine, individually. For instance, in a simple parallel setup, the typical test_case, I found this:

-
root at rfaccio:/home/rfaccio/calculo/wien2k/test_case# x lapw1 -c -up -p
starting parallel lapw1 at Thu Oct 1 23:32:56 UYT 2009
- starting parallel LAPW1 jobs at Thu Oct 1 23:32:56 UYT 2009
running LAPW1 in parallel mode (using .machines)
2 number_of_parallel_jobs
[1] 30629
root at localhost's password:
-

As you can see, the code keeps waiting for my password. It never happened before, and it seems to have no relationship with the shared-memory setup, because I used the same flags as in previous release versions. I use this .machines file:

1:localhost
1:localhost
granularity:1
extrafine:1

I would like to know if there is an easy way to fix it, because I would like to avoid any important modification of my PC's security setup. Is it a new modification, or strictly a problem of the user (me!)?

Thanks in advance
Ricardo

p.s.: this is my OPTIONS file

current:FOPT:-FR -O3 -mp1 -w -prec_div -pc80 -pad -align -DINTEL_VML
current:FPOPT:$(FOPT)
current:LDFLAGS:$(FOPT) -L/opt/intel/Compiler/11.0/081/mkl/lib/em64t -pthread -i-static
current:DPARALLEL:'-DParallel'
current:R_LIBS:-lmkl_lapack -lmkl -lguide -lmkl_scalapack -lmkl_blacs_lp64
current:RP_LIBS:-lmkl_intel_lp64 -lmkl_scalapack_lp64 -lmkl_blacs_lp64 -lmkl_sequential -L /opt/local/fftw/lib/ -lfftw_mpi -lfftw
current:MPIRUN:

-
Dr. Ricardo Faccio
Prof. Adjunto de Física
Cryssmat-Lab., Cátedra de Física, DETEMA
Facultad de Química, Universidad de la República
Av. Gral. Flores 2124, C.C. 1157
C.P. 11800, Montevideo, Uruguay
E-mail: rfaccio at fq.edu.uy
Phone: 598 2 924 98 59 / 598 2 929 06 48
Fax: 598 2 9241906
Web: http://cryssmat.fq.edu.uy/ricardo/ricardo.htm
-
[Wien] parallel calculation stops waiting for a pass in ssh root@localhost
You should check the permissions of .ssh and the files therein. For instance:

drwx------ .ssh
-rw-r--r-- id_rsa.pub
-rw------- id_rsa
-rw-r--r-- authorized_keys
-rw-r--r-- known_hosts

If it still doesn't work, open the authorized_keys file and delete the lines related to your host. Then use ssh-keygen to generate a new key without a password, and cat id_rsa.pub >> authorized_keys. Good luck.

2009/10/1 Ricardo Faccio <rfaccio at fq.edu.uy>:
> [...]

--
Duy Le
PhD Student
Department of Physics
University of Central Florida.
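Spelled out as commands, a minimal sketch of the passwordless-ssh setup described above and in the next reply (standard ssh defaults assumed; leave the passphrase empty when prompted):

ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_rsa
ssh localhost    # should now log in without asking for a password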
[Wien] parallel calculation stops waiting for a pass in ssh root@localhost
First, about security: you should NOT run WIEN2k as user root. root is the system administrator, and one should be root ONLY when doing sysadmin tasks. Create a regular user and run normal work as a restricted user.

k-parallelization has 2 options during siteconfig:

shared memory machine: This does NOT use ssh to spawn the parallel jobs, but simply starts e.g. 4 lapw1 jobs (in the background) on the local machine. This is the simplest option when you have e.g. a single quad-core PC, but you cannot parallelize between 2 different PCs.

no shared memory: It will use a command like "ssh pcX lapw0 lapw0.def ..." to start a job on a remote, but also on the local, computer. Thus you need to be able to do passwordless ssh to your local computer. Try "ssh localhost" and you must be able to log in WITHOUT typing a password. To set this up you must use ssh-keygen; for details see e.g. the UG.

Ricardo Faccio wrote:
> [...]

--
Peter Blaha
Inst. Materials Chemistry, TU Vienna
Getreidemarkt 9, A-1060 Vienna, Austria
Tel: +43-1-5880115671  Fax: +43-1-5880115698
email: pblaha at theochem.tuwien.ac.at
[Wien] parallel calculation stops waiting for a pass in ssh root@localhost
Dear Prof. Blaha and Duy Le,

Thanks for your help. I found that after modifying the lapw1cpara and lapw1para scripts by hand, they work properly for me:

#
# In this section use 0 to turn off an option, 1 to turn it on,
# respectively choose a value
set useremote = 0   # using remote shell to launch processes
set mpiremote = 1   # using remote shell to launch mpi
set delay     = 1   # delay launching of processes by n seconds
set sleepy    = 1   # additional sleep before checking
set debug     = 0   # verbosity of debugging output
#

Thanks once again
Ricardo

-
Dr. Ricardo Faccio
Cryssmat-Lab., Cátedra de Física, DETEMA
Facultad de Química, Universidad de la República
E-mail: rfaccio at fq.edu.uy
Web: http://cryssmat.fq.edu.uy/ricardo/ricardo.htm
-

- Original Message -
From: Peter Blaha <pbl...@theochem.tuwien.ac.at>
To: A Mailing list for WIEN2k users <wien at zeus.theochem.tuwien.ac.at>
Sent: Friday, October 02, 2009 3:40 AM
Subject: Re: [Wien] parallel calculation stops waiting for a pass in ssh root@localhost
> [...]