Re: [Pw_forum] Running in Parallel
On Mon, Nov 28, 2016 at 7:49 PM, Mofrad, Amir Mehdi (MU-Student) <am...@mail.missouri.edu> wrote:

> However, after I compiled it to the newer version 6, apparently the
> simulation gets frozen (it dumps some information on the output file
> though). It recognizes the parallel version also (writes down the number of
> processors at the beginning of the output file). So I can't diagnose what
> the problem is.

You have to, because (I cannot be 100% sure, but 99.99%, yes) the problem is on your machine's side, not on the QE side. As a first step, look at the differences between the previous "make.sys" and the current "make.inc".

Paolo

--
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222

___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum
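The comparison of "make.sys" and "make.inc" suggested in this reply can be done with standard tools. What follows is only a minimal sketch, assuming both the old QE 5 tree and the new QE 6 tree are still on the cluster. The function name compare_build_config and the example paths are hypothetical and are not part of QE.

```shell
#!/bin/sh
# compare_build_config OLD NEW   (hypothetical helper, not part of QE)
# Print a unified diff of two build configuration files, ignoring
# comment lines and blank lines so that only real settings (compilers,
# MPI wrappers, library paths, preprocessor flags) are compared.
# Returns 0 if the settings are identical, 1 if they differ, 2 on error.
compare_build_config() {
    old_file=$1
    new_file=$2
    if [ ! -r "$old_file" ] || [ ! -r "$new_file" ]; then
        echo "compare_build_config: cannot read input files" >&2
        return 2
    fi
    old_tmp=$(mktemp) || return 2
    new_tmp=$(mktemp) || { rm -f "$old_tmp"; return 2; }
    # grep exits 1 when every line is filtered out; that is not an error here.
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$old_file" > "$old_tmp" || :
    grep -v -e '^[[:space:]]*#' -e '^[[:space:]]*$' "$new_file" > "$new_tmp" || :
    status=0
    diff -u "$old_tmp" "$new_tmp" || status=$?
    rm -f "$old_tmp" "$new_tmp"
    return "$status"
}

# Example call (paths are placeholders for your own installation):
# compare_build_config "$HOME/espresso-5/make.sys" "$HOME/qe-6/make.inc"
```

Lines starting with "-" come from the old file and lines starting with "+" from the new one. Differences in the MPI compiler wrapper or in the linked libraries are probably the first things to check.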
Re: [Pw_forum] Running in Parallel
I used to run the older version of QE in parallel on a cluster using SLURM with the following command in job script and it worked fine.

mpirun pw.x -inp $INPUTFILE

However, after I compiled it to the newer version 6, apparently the simulation gets frozen (it dumps some information on the output file though). It recognizes the parallel version also (writes down the number of processors at the beginning of the output file). So I can't diagnose what the problem is.

Any help would be thoroughly appreciated.

From: pw_forum-boun...@pwscf.org on behalf of Filippo SPIGA
Sent: Wednesday, November 23, 2016 4:57:06 AM
To: PWSCF Forum
Subject: Re: [Pw_forum] Running in Parallel

On Nov 22, 2016, at 11:48 PM, Mofrad, Amir Mehdi (MU-Student) wrote:
> After I compiled version 6 I can't run it in parallel.

A bit more information about how you compiled and how you run will be useful to understand your problem.

--
Filippo SPIGA ~ Quantum ESPRESSO Foundation ~ http://www.quantum-espresso.org
Re: [Pw_forum] Running in Parallel
On Nov 22, 2016, at 11:48 PM, Mofrad, Amir Mehdi (MU-Student) wrote:
> After I compiled version 6 I can't run it in parallel.

A bit more information about how you compiled and how you run will be useful to understand your problem.

--
Filippo SPIGA ~ Quantum ESPRESSO Foundation ~ http://www.quantum-espresso.org
[Pw_forum] Running in Parallel
Dear QE users and developers,

I had been running QE 5 in parallel on a cluster managed by SLURM. After I compiled version 6 I can't run it in parallel. Has anyone had the same problem?

Thank you.

Amir M. Mofrad
Re: [Pw_forum] Running in Parallel
Greetings,

I believe the number of nodes (--nodes=<integer>) fixes the number of nodes over which the specified number of CPUs will be used; in other words, it is designed for cases where you want to force all your calculations to land on a single node to minimize inter-node message passing. If you need more CPUs, you need to specify that in the batch file (--ntasks=<integer>).

Regards

On Thu, Feb 11, 2016 at 1:21 PM, Mofrad, Amir Mehdi (MU-Student) <am...@mail.missouri.edu> wrote:

> Dear all QE users and developers,
>
> I want to run Quantum Espresso on a cluster using SLURM. My problem is
> that whenever I request one node in my batch file and, say, 24
> processors, my output file says that it has run on 24 processors.
> However, when I request more than one node (say 10 nodes) and this time
> 12 processors (which would be 120 processors overall), my output file
> shows 12 processors, not 120.
>
> I was wondering if anyone has worked with SLURM and could help me figure
> out what the problem is.
>
> Best,
>
> Amir M. Mofrad

--
Hooman Yaghoobnejad
PhD candidate, Department of Chemistry
Missouri University of Science and Technology
Rolla, MO 65401
USA
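To make the node and task arithmetic concrete: in SLURM, --ntasks is the total number of tasks over the whole allocation, so 10 nodes with --ntasks=12 gives 12 processes, not 120. The batch script below is only a sketch of one way to request 10 nodes with 12 MPI processes each. It assumes that this is what the original job intended and that the site's mpirun accepts -np. The INPUTFILE default pw.in and the output name pw.out are placeholders, and some sites prefer srun over mpirun.

```shell
#!/bin/sh
# Number of nodes to allocate.
#SBATCH --nodes=10
# MPI processes to start on each node (10 x 12 = 120 in total).
#SBATCH --ntasks-per-node=12

# Outside a SLURM job these variables are unset, so fall back to the
# numbers from the example above.
NODES=${SLURM_JOB_NUM_NODES:-10}
TASKS_PER_NODE=${SLURM_NTASKS_PER_NODE:-12}
NTASKS=$((NODES * TASKS_PER_NODE))
echo "Expecting $NTASKS MPI processes in the pw.x output header"

# INPUTFILE is a placeholder; set it to your own pw.x input file.
INPUTFILE=${INPUTFILE:-pw.in}

if [ -n "${SLURM_JOB_ID:-}" ]; then
    # Inside a job: start exactly NTASKS processes across the allocation.
    mpirun -np "$NTASKS" pw.x -inp "$INPUTFILE" > pw.out
else
    # Dry run outside SLURM: only show the command that would be launched.
    echo "mpirun -np $NTASKS pw.x -inp $INPUTFILE"
fi
```

Requesting --nodes=10 together with --ntasks=120 should be equivalent as far as the total is concerned. If the pw.x output still reports fewer processors than NTASKS, the MPI launcher is probably not picking up the SLURM allocation.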
[Pw_forum] Running in Parallel
Dear all QE users and developers,

I want to run Quantum Espresso on a cluster using SLURM. My problem is that whenever I request one node in my batch file and, say, 24 processors, my output file says that it has run on 24 processors. However, when I request more than one node (say 10 nodes) and this time 12 processors (which would be 120 processors overall), my output file shows 12 processors, not 120.

I was wondering if anyone has worked with SLURM and could help me figure out what the problem is.

Best,

Amir M. Mofrad
[Pw_forum] running in parallel
On Thursday 24 November 2005 05:42, Jaita Paul wrote:

> when i try to run my job (pw.x) in parallel, the job gets killed with the
> following error msg:
> from davcio : error #10
> i/o error in davcio

See the preceding message: check where you are writing temporary files. There is a rather lengthy discussion in the user's guide about problems in parallel runs.

P.

--
Paolo Giannozzi             e-mail: giannozz at nest.sns.it
Scuola Normale Superiore    Phone: +39/050-509876, Fax:-563513
Piazza dei Cavalieri 7      I-56126 Pisa, Italy
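A quick way to act on the advice about temporary files is to test whether the scratch directory can actually be written to. The helper below is a rough sketch. The function name check_scratch is hypothetical, and the directory to test is whatever the "outdir" variable in the pw.x input points to, which is not shown in this thread.

```shell
#!/bin/sh
# check_scratch DIR   (hypothetical helper, not part of QE)
# Return 0 if DIR exists and a file can be created and removed in it;
# otherwise print a message to stderr and return 1.
check_scratch() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "check_scratch: $dir is not a directory" >&2
        return 1
    fi
    # Probe file name includes the shell's PID to avoid clashes.
    probe="$dir/.write_test.$$"
    if ! ( : > "$probe" ) 2>/dev/null; then
        echo "check_scratch: cannot write to $dir" >&2
        return 1
    fi
    rm -f "$probe"
    echo "check_scratch: $dir is writable on $(uname -n)"
}

# Example call (the path is a placeholder for your own outdir):
# check_scratch /scratch/$USER/tmp
```

A check on the login node alone does not prove much, since the MPI processes write their files from the compute nodes. The test is most useful when run on each node listed in the hostfile.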
[Pw_forum] running in parallel
Hi.

When I try to run my job (pw.x) in parallel, the job gets killed with the following error message:

%%
from davcio : error #10
i/o error in davcio
%%
stopping ...
rank 3 in job 6 compute-0-8.local_49197 caused collective abort of all ranks
exit status of rank 3: return code 0

The job script is:

#!/bin/sh
#$ -S /bin/sh
#$ -cwd
#$ -o stdout
#$ -e stderr
#$ -pe mpich 4
cd /home/jaita-data/espresso/espresso-2.1.5/test
/home/jaita/mpich2/bin/mpdrun -hf hostfile -n 4 ../bin/pw.x -input x_s5v3_12X3X1.rx.inp > x_s5v3_12X3X1.rx.out

My system is a 20-processor Xeon cluster (20 GB of memory; the OS is Red Hat Linux 3.0).

Thanks in advance.

jaita.