Of course, the syntax is the same for .machines and .machines_x

optimize_abc uses the following strategy:

scf-cycle for a0b0c0  ,   uses  .machines

parallel calculation of 9 cases, namely
a0+-delta,b0,c0   (2 sequential scf cycles each)
a0,b0+-delta,c0
.....

Each of these 9 cases uses its own .machines_x file. Because the cases run in parallel, these .machines_x files should use different cores.
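To make the bookkeeping of one step explicit, here is a small Python sketch (not part of WIEN2k, the function names are made up for illustration). It only counts scf runs as described above: one central run plus 9 cases with 2 sequential runs each.

```python
# Counting sketch for one optimize_abc step, using only the numbers quoted above.
CENTRAL_RUNS = 1     # a0,b0,c0, run with .machines
PARALLEL_CASES = 9   # each case has its own .machines_x
RUNS_PER_CASE = 2    # "2 sequential scf cycles each"


def scf_runs_per_step(central=CENTRAL_RUNS, cases=PARALLEL_CASES,
                      per_case=RUNS_PER_CASE):
    """Total number of scf runs in one full step."""
    return central + cases * per_case


def sequential_depth(central=CENTRAL_RUNS, per_case=RUNS_PER_CASE):
    """Runs that must happen one after another if all 9 cases run at once."""
    return central + per_case


print(scf_runs_per_step())  # 19
print(sequential_depth())   # 3
```

So a full step contains 19 scf runs, but only 3 of them have to wait for each other when the 9 cases really run side by side.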

So if you have 256 cores, use in

.machines as many cores as is efficient. This depends on the k-list and the cpu-time; note that more cores does NOT always mean a shorter run time. You have to find the optimum from case.dayfile. In many cases it could be that you should NOT use all cores.

.machines_x     distribute the cores over these 9 files.

Some more general hints:
How many atoms do you have in the unit cell? A first guess is to use

lapw0:node1:XX node2:XX     where 2*XX is equal to or a bit larger than NAT
                            (why only 4 cores in your example?)
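A minimal Python sketch of this first guess, assuming the rule is simply "smallest XX with nodes*XX >= NAT". The helper names and the value NAT = 60 are hypothetical; the host names are the ones from the .machines file quoted further down.

```python
import math


def lapw0_cores_per_node(nat, nodes=2):
    """Smallest XX such that nodes * XX is equal to or a bit larger than NAT."""
    return math.ceil(nat / nodes)


def lapw0_line(nat, hosts):
    """Build a lapw0 line in the 'lapw0: host:XX host:XX' form used in .machines."""
    xx = lapw0_cores_per_node(nat, len(hosts))
    return "lapw0: " + " ".join(f"{host}:{xx}" for host in hosts)


# NAT = 60 is an invented example value, not taken from this thread.
print(lapw0_line(60, ["irene4274", "irene4305"]))
# lapw0: irene4274:30 irene4305:30
```

This is only a starting point; XX must of course not exceed the cores available per node, and the real optimum should be checked in case.dayfile.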

lapw1 was running almost 18 hours? If this is true, massive parallelization should be possible. So how many k-points are you using?
What does :RKM say (matrix size)?
Depending on this, 16-fold k-parallelization with 16 mpi-cores each (as in your example) might be possible in .machines, but consider: if you have 17 k-points, 16 k-parallel jobs would be VERY inefficient.
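The 17 k-point remark can be checked with a rough Python sketch. It assumes every k-point costs the same time and that the k-points are handed out in whole rounds over the k-parallel jobs; the real distribution (granularity, extrafine) may differ, and the function name is made up for illustration.

```python
import math


def kparallel_efficiency(n_kpoints, n_jobs):
    """Return (rounds, busy fraction) for n_kpoints spread over n_jobs."""
    rounds = math.ceil(n_kpoints / n_jobs)
    busy = n_kpoints / (rounds * n_jobs)
    return rounds, busy


for nk in (16, 17, 32):
    rounds, busy = kparallel_efficiency(nk, 16)
    print(f"{nk} k-points on 16 jobs: {rounds} round(s), {busy:.0%} busy")
# 16 k-points on 16 jobs: 1 round(s), 100% busy
# 17 k-points on 16 jobs: 2 round(s), 53% busy
# 32 k-points on 16 jobs: 2 round(s), 100% busy
```

With 17 k-points the second round keeps only one of the 16 jobs busy, so the run takes about twice as long as with 16 k-points.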

For the 9 .machines_x files, you have to split the 256 cores.
One possibility is to use 2 k-parallel lines with 16 mpi-cores each. This would lead to a slight overload (18*16 = 288 cores necessary); if your system allows it, that is ok, but some queuing systems may complain. Alternatively, use only 1 k-parallel job per file, but with e.g. 25 mpi-cores (using 9*25 = 225 cores).
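The two options can be compared with a short Python sketch. Everything beyond the numbers above is an assumption: 128 cores per node (256 cores on 2 nodes, split evenly), a file never spanning two nodes, the lapw0 line reusing the same cores, and generic host names. The helper names are hypothetical, and the host names in a .machines file do not pin cores by themselves, so this only checks that the nodes are not oversubscribed.

```python
TOTAL_CORES = 256
N_FILES = 9


def cores_needed(k_lines, mpi_per_line, n_files=N_FILES):
    """Cores used by all .machines_x files running at the same time."""
    return n_files * k_lines * mpi_per_line


def fits(k_lines, mpi_per_line, total=TOTAL_CORES):
    """True if the layout does not overload the allocation."""
    return cores_needed(k_lines, mpi_per_line) <= total


def assign_hosts(hosts, cores_per_node, cores_per_file, n_files=N_FILES):
    """Give each file one host, never splitting a file across nodes."""
    per_node = cores_per_node // cores_per_file
    if per_node * len(hosts) < n_files:
        raise ValueError("not enough cores for all files")
    return [hosts[i // per_node] for i in range(n_files)]


def machines_x(host, k_lines, mpi_per_line):
    """Text of one .machines_x file (same syntax as .machines)."""
    lines = [f"1:{host}:{mpi_per_line}" for _ in range(k_lines)]
    lines.append(f"lapw0: {host}:{mpi_per_line}")  # assumption: same cores for lapw0
    lines += ["granularity:1", "extrafine:1"]
    return "\n".join(lines) + "\n"


print(cores_needed(2, 16), fits(2, 16))  # 288 False
print(cores_needed(1, 25), fits(1, 25))  # 225 True

# Assumed layout: 2 nodes with 128 cores each, 25 mpi-cores per file.
hosts = assign_hosts(["node1", "node2"], 128, 25)
for index, host in enumerate(hosts, start=1):
    print(f"--- file {index} of {N_FILES} on {host}")
    print(machines_x(host, 1, 25), end="")
```

With 25 cores per file, 5 files land on the first node (125 cores) and 4 on the second (100 cores), leaving a few cores idle but avoiding any overload.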

If the sequential lapw1 took 18 h, a good parallelization on an efficient 256-core machine should bring a speedup factor of 150-200, i.e. 5-10 minutes for lapw1 on the a0b0c0 case, and 9 times longer for each of the other cases, since they run in parallel and each gets only about 1/9 of the cores.
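The arithmetic behind this estimate, as a small Python sketch. It assumes ideal scaling with the quoted speedup factors and takes the "9 times longer" literally; the function name is made up for illustration.

```python
SEQUENTIAL_HOURS = 18  # sequential lapw1 time quoted above


def parallel_minutes(seq_hours, speedup):
    """Wall time in minutes after dividing the sequential time by the speedup."""
    return seq_hours * 60 / speedup


for speedup in (150, 200):
    central = parallel_minutes(SEQUENTIAL_HOURS, speedup)
    print(f"speedup {speedup}: {central:.1f} min for a0b0c0, "
          f"{9 * central:.0f} min for each parallel case")
# speedup 150: 7.2 min for a0b0c0, 65 min for each parallel case
# speedup 200: 5.4 min for a0b0c0, 49 min for each parallel case
```

So each lapw1 call in the 9 parallel cases would take roughly 50-65 minutes under these assumptions.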

I hope you have ELPA !!!!

Hope this helps and you can do it in 24 h (otherwise reduce the number of k-points and use more mpi-cores ....)

Peter

On 25.02.2022 at 21:23, pboulet wrote:
Dear Peter,

Well, now with the command:
optimize_abc_lapw -t 3 -n 1 -p -j "run_lapw -p -ec 0.0001 -cc 0.001"

I have the weird behaviour that each program seems to be executed on one core only! I asked for 2 nodes/256 cores. In the case.dayfile I can read:
lapw0   -p  (21:53:37) running lapw0 in single mode
lapw1  -p           (22:02:30) running lapw1 in single mode
lapw2 -p            (15:47:14) running in single mode

I guess the problem is because I have not created the .machines_1..9 files.
But if so, what should these files contain? The same as .machines (see post scriptum below)?

Best
Pascal

PS. Here is the content of the .machines file:
# OMP parallelization
omp_global:1
#omp_lapw1:1
#omp_lapw2:1
#omp_lapwso:1
#omp_dstart:1
#omp_sumpara:1
#omp_nlvdw:1

# k-point parallelization for lapw1/2 hf lapwso qtl irrep  nmr  optic
1:irene4274:16
1:irene4274:16
1:irene4274:16
1:irene4274:16
1:irene4274:16
1:irene4274:16
1:irene4274:16
1:irene4274:16
1:irene4305:16
1:irene4305:16
1:irene4305:16
1:irene4305:16
1:irene4305:16
1:irene4305:16
1:irene4305:16
1:irene4305:16

# MPI parallelization for dstart lapw0 nlvdw
dstart: irene4274:4
lapw0: irene4274:4
nlvdw: irene4274:4

granularity:1
extrafine:1


On 24 Feb 2022, at 18:20, Peter Blaha <pbl...@theochem.tuwien.ac.at> wrote:

What you need is always to finish a complete "step" (19 scf cycles, which can be done highly parallel).

optimize_abc  -n 1  .....

would do this. This command can be repeated until you find convergence.
(If it crashes after more than 1 step, you can still continue, but all calculations after the last full step will be lost. If you "see" that a step has finished (parabol_fit done) but it is clear that another one will not finish within the time limit, you can kill the whole job and submit another one.)

Regards
Peter Blaha

On 24.02.2022 at 14:08, pboulet wrote:
Dear all,
I am optimizing an orthorhombic structure with optimize_abc (wien2k_21). As the structure is big I suspect the job will not finish before the queue reaches the CPU time limit of 24 hours.
The command I use is:
optimize_abc_lapw -t 3 -p -j "run_lapw -p -ec 0.0001 -cc 0.001 -fc 1.0 -min"
Can I continue the run properly with optimize_abc_lapw once the queue stops the job, for instance with the same command:
optimize_abc_lapw -t 3 -p -j "run_lapw -p -ec 0.0001 -cc 0.001 -fc 1.0 -min" ?
If not, is there a way?
Thank you for your hints,
Best regards
Pascal
Pascal Boulet
—
/Professor in computational materials chemistry - DEPARTMENT OF CHEMISTRY/
University of Aix-Marseille - Avenue Escadrille Normandie Niemen - F-13013 Marseille - FRANCE
Tél: +33(0)4 13 55 18 10 - Fax : +33(0)4 13 55 18 50
Email: pascal.bou...@univ-amu.fr
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html

--
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at
-------------------------------------------------------------------------