Re: [Pw_forum] Running in Parallel

2016-11-29 Thread Paolo Giannozzi
On Mon, Nov 28, 2016 at 7:49 PM, Mofrad, Amir Mehdi (MU-Student) <
am...@mail.missouri.edu> wrote:

> However, after I compiled the newer version 6, the simulation apparently
> freezes (it does dump some information to the output file, though). It also
> recognizes the parallel version (it writes the number of processors at the
> beginning of the output file). So I can't diagnose what the problem is.
>

You have to, because (I cannot be 100% sure, but 99.99%, yes) the problem
is on your machine's side, not on the QE side. As a first step, look at the
differences between the previous "make.sys" and the current "make.inc".
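
As a minimal sketch (the directory names are placeholders for wherever the
two source trees actually live), the comparison can be as simple as:

  # compare the old and new build configurations; the MPI compiler
  # wrappers and the linked libraries are the usual suspects
  diff espresso-5.x/make.sys espresso-6.0/make.inc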

Paolo

-- 
Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
Phone +39-0432-558216, fax +39-0432-558222
___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum

Re: [Pw_forum] Running in Parallel

2016-11-28 Thread Mofrad, Amir Mehdi (MU-Student)
I used to run the older version of QE in parallel on a cluster using SLURM
with the following command in my job script, and it worked fine.

  mpirun pw.x -inp $INPUTFILE
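
For context, a minimal SLURM script wrapping that command might look like the
following sketch; the resource values and file names here are illustrative,
not the actual script:

  #!/bin/bash
  #SBATCH --nodes=1            # illustrative resource request
  #SBATCH --ntasks=24

  INPUTFILE=pw.scf.in          # placeholder for the real input file
  mpirun pw.x -inp $INPUTFILE > pw.scf.out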


However, after I compiled the newer version 6, the simulation apparently
freezes (it does dump some information to the output file, though). It also
recognizes the parallel version (it writes the number of processors at the
beginning of the output file). So I can't diagnose what the problem is. Any
help would be thoroughly appreciated.


From: pw_forum-boun...@pwscf.org on behalf of Filippo SPIGA
Sent: Wednesday, November 23, 2016 4:57:06 AM
To: PWSCF Forum
Subject: Re: [Pw_forum] Running in Parallel

On Nov 22, 2016, at 11:48 PM, Mofrad, Amir Mehdi (MU-Student) wrote:
> After I compiled version 6 I can't run it in parallel.

A bit more information about how you compiled and how you run would be useful
for understanding your problem.

--
Filippo SPIGA ~ Quantum ESPRESSO Foundation ~ http://www.quantum-espresso.org


___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum

Re: [Pw_forum] Running in Parallel

2016-11-23 Thread Filippo SPIGA
On Nov 22, 2016, at 11:48 PM, Mofrad, Amir Mehdi (MU-Student) wrote:
> After I compiled version 6 I can't run it in parallel.

A bit more information about how you compiled and how you run would be useful
for understanding your problem.

--
Filippo SPIGA ~ Quantum ESPRESSO Foundation ~ http://www.quantum-espresso.org


___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum


[Pw_forum] Running in Parallel

2016-11-22 Thread Mofrad, Amir Mehdi (MU-Student)
Dear QE users and developers,

I had been running QE 5 in parallel on a cluster operated by SLURM. After
compiling version 6, I can't run it in parallel anymore. Has anyone had the
same problem? Thank you.



Amir M. Mofrad
___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum

Re: [Pw_forum] Running in Parallel

2016-02-11 Thread Hooman Yaghoobnejad Asl
Greetings,

I believe the nodes option (--nodes=<integer>) forces the number of nodes
across which the specified number of CPUs will be used; in other words, it is
designed for cases where you want to force all your calculations to land on
a single node, to minimize inter-node message passing. If you need more
CPUs, you need to specify that in the batch file as well (--ntasks=<integer>).
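
For example, a sketch of the relevant directives for 120 MPI tasks spread
over 10 nodes (the values are illustrative) would be:

  #SBATCH --nodes=10     # spread the job over 10 nodes
  #SBATCH --ntasks=120   # 120 MPI tasks in total, i.e. 12 per node

  # srun picks the task count up from SLURM automatically; a bare
  # mpirun may need it passed explicitly:
  srun pw.x -inp $INPUTFILE
  # or: mpirun -np $SLURM_NTASKS pw.x -inp $INPUTFILE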

Regards

On Thu, Feb 11, 2016 at 1:21 PM, Mofrad, Amir Mehdi (MU-Student) <
am...@mail.missouri.edu> wrote:

> Dear all QE users and developers,
>
>
> I want to run Quantum ESPRESSO on a cluster using SLURM. My problem is
> that whenever I request one node in my batch file and, say, 24
> processors, my output file says that it has run on 24 processors.
> However, when I request more than one node (say 10 nodes) and this time
> 12 processors per node (which would be 120 processors overall), my
> output file shows 12 processors, not 120.
>
> I was wondering if anyone who has worked with SLURM could help me figure
> out what the problem is.
>
>
> Best,
>
>
> Amir M. Mofrad
>
>
>
> ___
> Pw_forum mailing list
> Pw_forum@pwscf.org
> http://pwscf.org/mailman/listinfo/pw_forum
>



-- 

*Hooman Yaghoobnejad*

*PhD candidate, Department of Chemistry*

*Missouri University of Science and Technology*

*Rolla, MO 65401*
*USA*
___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum

[Pw_forum] Running in Parallel

2016-02-11 Thread Mofrad, Amir Mehdi (MU-Student)
Dear all QE users and developers,


I want to run Quantum ESPRESSO on a cluster using SLURM. My problem is that
whenever I request one node in my batch file and, say, 24 processors, my
output file says that it has run on 24 processors. However, when I request
more than one node (say 10 nodes) and this time 12 processors per node (which
would be 120 processors overall), my output file shows 12 processors, not 120.

I was wondering if anyone who has worked with SLURM could help me figure out
what the problem is.


Best,


Amir M. Mofrad



___
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum

[Pw_forum] running in parallel

2005-11-24 Thread Paolo Giannozzi
On Thursday 24 November 2005 05:42, Jaita Paul wrote:

> When I try to run my job (pw.x) in parallel, the job gets killed with the
> following error message:

>  from davcio : error #10 
>  i/o error in davcio

See the preceding message: check where you are writing temporary files.
There is a rather lengthy discussion in the user's guide about problems in
parallel runs.
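
For instance (a sketch only; the scratch path below is just an example), one
can check which directory the run writes its temporary files to, and make
sure it exists and is writable on every compute node:

  # which scratch directory does the input point pw.x to?
  grep -i outdir x_s5v3_12X3X1.rx.inp

  # make sure that directory exists on every node
  # (/scratch/$USER/tmp is only an example of a typical local path)
  mkdir -p /scratch/$USER/tmp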

P.

-- 
Paolo Giannozzi e-mail:  giannozz at nest.sns.it
Scuola Normale Superiore    Phone: +39/050-509876, Fax: -563513
Piazza dei Cavalieri 7  I-56126 Pisa, Italy



[Pw_forum] running in parallel

2005-11-24 Thread Jaita Paul
Hi,

When I try to run my job (pw.x) in parallel, the job gets killed with the
following error message:

 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 from davcio : error #10
 i/o error in davcio
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

 stopping ...
rank 3 in job 6  compute-0-8.local_49197   caused collective abort of all
ranks
  exit status of rank 3: return code 0


The job script is:

#!/bin/sh
# SGE directives: run in the current directory and request 4 slots
# in the "mpich" parallel environment
#$ -S /bin/sh
#$ -cwd
#$ -o stdout
#$ -e stderr
#$ -pe mpich 4
# launch pw.x on 4 processes via MPICH2's mpdrun
cd /home/jaita-data/espresso/espresso-2.1.5/test
/home/jaita/mpich2/bin/mpdrun -hf hostfile -n 4 ../bin/pw.x -input x_s5v3_12X3X1.rx.inp > x_s5v3_12X3X1.rx.out

My system is a 20-processor Xeon cluster (20 GB memory, running Red Hat
Linux 3.0).

Thanks in advance.
Jaita.