On 08 Jun 2011, at 20:20, Vi Vo wrote:

> Is it possible that I double the number of CPUs and use -npool 2?    

Yes, it is possible, and that's what you should do in order to speed up the 
calculation.

> I see that the number of planes printed at the beginning of the output is 90.
> Thus the maximum number of CPUs that I can use is 90, i.e. one CPU per plane.

The number of cpus employed for the FFT parallelization is 
nproc_pool=nproc_tot/npool. Therefore, if you double the number of processors 
(let's say from 90 to 180) and double the number of pools (from 1 to 2), then 
you will still have the same parallelization scheme for the FFT grid (within 
each pool).
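
The arithmetic above can be checked with a small Python sketch. It is only an 
illustration: the function names are hypothetical and not part of Quantum 
ESPRESSO, the requirement that nproc_tot be a multiple of npool is an 
assumption made here, and the 40 k-points are just an example figure.

```python
def procs_per_pool(nproc_tot, npool):
    """Processes available to each pool: nproc_pool = nproc_tot / npool."""
    if nproc_tot < 1 or npool < 1:
        raise ValueError("nproc_tot and npool must be positive")
    # Assumption of this sketch: the processes split evenly among the pools.
    if nproc_tot % npool != 0:
        raise ValueError("nproc_tot must be a multiple of npool")
    return nproc_tot // npool


def max_kpoints_per_pool(nkpt, npool):
    """Largest number of k-points any single pool handles (ceiling division)."""
    return -(-nkpt // npool)


if __name__ == "__main__":
    # 90 CPUs with 1 pool and 180 CPUs with 2 pools give the same nproc_pool.
    for nproc_tot, npool in [(90, 1), (180, 2)]:
        print(nproc_tot, npool, procs_per_pool(nproc_tot, npool))
    # With 2 pools each pool works on roughly half of the k-points.
    print(max_kpoints_per_pool(40, 1), max_kpoints_per_pool(40, 2))
```

Both configurations leave 90 processes per pool for the FFT grid, while the 
number of k-points each pool has to treat is halved in the second one.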

In principle nothing prevents you from using more CPUs than the number of FFT 
planes, but this might be inefficient. In that case you should employ task 
groups or threading via OpenMP (for more details, please refer to the user 
guide or to the QE paper, linked in the bibliography section of the 
quantum-espresso.org website).
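
To see why going beyond the number of planes may be wasteful, here is a rough 
Python sketch. It assumes a deliberately simplified model in which the FFT 
planes are handed out as evenly as possible to the processes of one pool; the 
actual distribution done by the code is more involved, and the function names 
are hypothetical.

```python
from typing import List


def plane_distribution(nplanes: int, nproc_pool: int) -> List[int]:
    """Planes per process when nplanes are spread as evenly as possible."""
    if nplanes < 0 or nproc_pool < 1:
        raise ValueError("need nplanes >= 0 and nproc_pool >= 1")
    base, extra = divmod(nplanes, nproc_pool)
    # The first `extra` processes receive one plane more than the others.
    return [base + (1 if rank < extra else 0) for rank in range(nproc_pool)]


def idle_processes(nplanes: int, nproc_pool: int) -> int:
    """Processes left without any plane in this simplified model."""
    return sum(1 for n in plane_distribution(nplanes, nproc_pool) if n == 0)


if __name__ == "__main__":
    for nproc_pool in (45, 90, 180):
        print(nproc_pool, idle_processes(90, nproc_pool))
```

In this model, 180 processes in a single pool with 90 planes would leave half 
of the processes without a plane, which is the situation that task groups or 
OpenMP threading are meant to address.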


HTH

GS
> 
> Thanks,
> 
> Vi
> From: Gabriele Sclauzero <sclauzer at sissa.it>
> To: PWSCF Forum <pw_forum at pwscf.org>
> Sent: Wed, June 8, 2011 12:18:49 AM
> Subject: Re: [Pw_forum] nscf restart
> 
> Dear Vi,
> 
> On 07 Jun 2011, at 23:46, Vi Vo wrote:
> 
>> Dear All,
>> 
>> I need to run nscf with a kpt-grid 17x17x17.  However, I can only have 24 
>> hrs to run, so the job won't be finished in that short time slot.  I will 
>> need to restart after every 24hrs.  If I use the 'restart' option, one thing 
>> I am worried about is that after the first run, the scf charge density file 
>> will be overwritten and replaced by the nscf charge density file. 
> 
> I don't think that the nscf run will overwrite the scf charge density file. I 
> think it will just be read and used to compute the scf potential. What will 
> be changed are the eigenfunctions in .wfc and the eigenvalues in the restart 
> files inside .save
> 
>> When the job is restarted, the charge density file saved in previous run and 
>> the *.wfc files will be read.  Is the continuing nscf calculation still 
>> correct?
> 
> I remember that this could be done correctly if one specifies 
> disk_io="high". Then some additional files should be written to keep track of 
> the k-point and band at which the calculation stopped. At that time the 
> max_seconds option was not working in that case (because the check is outside 
> the subroutine electrons), so the run would be interrupted "brutally" (i.e. 
> killed by the queuing system, in your case). The restart should work fine 
> anyway if you add this option. You can also add verbosity="high" to see how 
> many k-points have been computed up to that point. 
> 
>> 
>> One other option is that I can look at how many kpts needed for the grid 
>> 17x17x17 by using the kpt list printed out in the output file if 17x17x17 is 
>> used, then run smaller jobs, each of which, for example, includes the nscf 
>> calculation of 20 kpts.  However, when I started the job in this way, more 
>> kpts than those that I specified in the input file were calculated, eg 
>> 40kpts instead of 20kpts.  I understand the code searched for other 
>> equivalent kpts and calculated them.  Because of this, the job required 
>> more time to finish all 40 kpts than I had planned.  In order to avoid 
>> this, I specified the option "nosym=.true.", so that only those kpts I 
>> specify in the input file are calculated.  However, I am not sure if it is 
>> correct to do it this way. 
> 
> Not sure either that this gives you exactly what you want and I don't know if 
> there are other side effects.
> 
> 
>> Another issue I encountered is that by chopping into smaller jobs, the kpt 
>> weight in each 20kpts-job is not correct anymore due to the way the code 
>> computes the kpt weight. 
> 
> Of course if you want to compute DOS or PDOS, and therefore also need correct 
> weights, the above method is not ideal.
> 
> What about using more processors together with pools? You should be able to 
> reduce the running time by about half by doubling the number of processors 
> with -npool 2, for instance.
> 
> 
> HTH
> 
> 
> GS
> 
> 
>> Could you advise whether there is anything wrong with the approach I described above?
>> 
>> Thank you very much,
>> 
>> Vi
>> University of Houston      
>> 
>> _______________________________________________
>> Pw_forum mailing list
>> Pw_forum at pwscf.org
>> http://www.democritos.it/mailman/listinfo/pw_forum
> 
> 
> Gabriele Sclauzero, EPFL SB ITP CSEA
>    PH H2 462, Station 3, CH-1015 Lausanne
> 


Gabriele Sclauzero, EPFL SB ITP CSEA
   PH H2 462, Station 3, CH-1015 Lausanne
