[Wien] No lines in spaghetti

2011-06-07 Thread Sanjeev K. Srivastava
Dear Wien2K users

I am trying to plot the band structure of a monoclinic CXZ lattice. I picked the 
standard symmetry points of the BZ of this monoclinic lattice. Everything went 
well except that there are no lines in the band structure, only dots at the 
symmetry points. What could be the reason or remedy?

Best regards

Sanjeev

-- 
Dr. Sanjeev Kumar Srivastava
Assistant Professor
Department of Physics and Meteorology
Indian Institute of Technology Kharagpur
Kharagpur 721302
India

Ph.: 0091-3222-283854 (Office)
 0091-3222-283855 (Residence)
Mobile:  0091-9735444091
---


[Wien] No lines in spaghetti

2011-06-07 Thread Sanjeev K. Srivastava
Dear Wien2K users

You need not reply to this mail; I have found the solution.

Sorry for the inconvenience.

Best regards

Sanjeev

___
Wien mailing list
Wien at zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien



[Wien] pondering wien2kMPI performance

2011-06-07 Thread Robert Laskowski
Hi,
lapw0 is parallelized in a loop over atoms, where there is little communication, 
and in FFTW; for the latter, look at the FFTW manual. For lapw1, the setup has no 
communication at all. The eigensolver is done with PBLAS and ScaLAPACK calls; 
here both latency and bandwidth are important, but these libraries should be 
well optimized, and I would point more to bandwidth. lapw2 uses two coexisting 
communicators, one for parallelization over atoms and the other for splitting 
the vector (lapw2_vector_split in .machines); for large systems you have to 
split the vector. I guess that the major time here is spent in PBLAS calls, 
which are matrix-matrix multiplications; however, on some old and less efficient 
systems we have noticed a huge amount of time spent on reading and distributing 
the vector file.

regards

Robert
  

On Monday 06 June 2011 23:31:44 Kevin Jorissen wrote:
 Dear wien2k community,
 
 I have a few basic questions regarding the MPI/SCALAPACK version of wien2k:
 
 * does anyone have a formula for calculating the memory requirements
 of the code (lapw0/1/2) given, say, nmat and nume and the number of
 cores used?  It's easy enough for the serial code, but I'm sometimes
 baffled by the memory taken by each of the MPI threads when
 distributing the job over N cores.  It's sometimes very different from
 [serial size in GB] / N_cores.  It makes the queue manager unhappy,
 and occasionally I unintentionally overload a node this way.
 
 * I was asked the following question about the MPI wien2k code:
  "So would it be correct to state that your apps are more bandwidth
  sensitive than latency sensitive?"
 
 and I don't know what to answer.  Thinking about LARGE calculations
 (hundreds of atoms) I want to say that both will be important ...
 Does anyone have a more sophisticated insight here?
 
 
 
 cheers,
 
 
 Kevin Jorissen
 University of Washington

-- 
Dr Robert Laskowski
Vienna University of Technology, Institute of Materials Chemistry, 
Getreidemarkt 9/165-TC, A-1060 Vienna, Austria
tel. +43 1 58801 15675   Fax  +43 1 58801 15698


[Wien] pondering wien2kMPI performance

2011-06-07 Thread Peter Blaha
Just adding three more comments to what Robert said:

lapw0: memory depends mainly on GMAX (case.in2) and/or the IFFT parameters and 
       enhancement factor in case.in0. Memory is critical for large FFT grids 
       (large enhancement factors); parallelization solves the problem.

lapw1: ScaLAPACK diagonalization needs significantly more memory than 
       sequential LAPACK. Instead of NMAT*(NMAT+1)/2 you need NMAT**2 for H 
       and S, plus additional large auxiliary arrays. Iterative diagonalization 
       needs another large NMAT**2 array plus vectors (NMAT*NUME).

lapw2: in most cases the real memory-critical step. There are many cases where 
       lapw1 still does fine in terms of memory, but lapw2 does NOT! Solve it 
       by using
       lapw2_vector_split: 2  (or even 4) in the .machines file.

Check the parallel case.output* files to get an idea about memory allocation.
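
The lapw1 numbers above can be turned into a rough back-of-the-envelope estimate. The following Python sketch is only an illustration and not part of WIEN2k: the function name, the 8/16 bytes per element (real vs. complex case), the reading of the element counts as "per matrix" for H and S, and the even split over MPI processes are all assumptions. It ignores the "additional large auxiliary arrays", so it gives a lower bound at best.

```python
GIB = 1024 ** 3

def lapw1_matrix_memory_gib(nmat, nume, complex_case=False, n_procs=1):
    """Rough lower-bound memory estimate (GiB) for H, S and the vectors.

    Assumptions (not taken from the WIEN2k source code):
      - 8 bytes per element for real cases, 16 for complex ones
      - the element counts apply to H and S separately
      - ScaLAPACK spreads the matrices evenly over n_procs processes
      - auxiliary work arrays are ignored
    """
    bytes_per_element = 16 if complex_case else 8
    packed = nmat * (nmat + 1) // 2   # sequential LAPACK, packed storage
    full = nmat ** 2                  # ScaLAPACK, full storage
    scalapack_total = 2 * full * bytes_per_element / GIB
    return {
        "serial_H_S": 2 * packed * bytes_per_element / GIB,
        "scalapack_H_S_total": scalapack_total,
        "scalapack_H_S_per_proc": scalapack_total / n_procs,
        "vectors_total": nmat * nume * bytes_per_element / GIB,
    }

if __name__ == "__main__":
    estimate = lapw1_matrix_memory_gib(nmat=20000, nume=2000, n_procs=16)
    for name, value in estimate.items():
        print(f"{name:24s} {value:8.2f} GiB")
```

The total for the MPI run is about twice the serial one, which is one reason why the memory per process can differ from [serial size] / N_cores. The actual allocation should still be read from the parallel case.output* files.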



-- 

   P.Blaha
--
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-15671 FAX: +43-1-58801-15698
Email: blaha at theochem.tuwien.ac.at    WWW: http://info.tuwien.ac.at/theochem/
--


[Wien] compilation aborted for lap_bp.f (code 1)

2011-06-07 Thread shamik chakrabarti
Dear wien2k users,

We have tried to install wien2k on a 64-bit system using the Intel compiler 
11.1.046. The OPTIONS used are given below:

current:FOPT:-FR -mp1 -w -prec_div -pad -ip -O3 -axTW -traceback
current:FPOPT:$(FOPT)
current:LDFLAGS:$(FOPT) -L/opt/intel/Compiler/11.1/046/lib/intel64
-static-intel -Bstatic -lguide -lguide_stats -lsvml -Bdynamic -lpthread
current:DPARALLEL:'-DParallel'
current:R_LIBS:-L/opt/intel/Compiler/11.1/046/mkl/lib/em64t -lguide
-lpthread
current:RP_LIBS:-lmkl_intel_lp64 -lmkl_scalapack_lp64 -lmkl_blacs_lp64
-lmkl_sequential -lmkl_em64t

All the programs compiled properly except lapwso, where one error appeared as
follows:

compilation aborted for lap_bp.f (code 1)
make: *** [lap_bp.o] Error 1

We are not able to proceed any further. Any response in this regard will be
appreciated. Thanks in advance,

with best regards,
-- 
Shamik Chakrabarti
Research Scholar
Dept. of Physics & Meteorology
Material Processing & Solid State Ionics Lab
IIT Kharagpur
Kharagpur 721302
INDIA


[Wien] GGA-EV

2011-06-07 Thread t...@theochem.tuwien.ac.at
The EV-GGA functional corresponds to indxc=15 in case.in0.
This means EV93 for exchange and PW91 for correlation.

On Tue, 7 Jun 2011, AJAY SINGH VERMA wrote:

 
 Dear all users and Blaha Sir,
 Many papers quote results obtained with the EV-GGA functional, but I am 
 unable to find it in the User's Guide. Could you please clarify?
 Thank you.
 A S Verma 
 


[Wien] compilation aborted for lap_bp.f (code 1)

2011-06-07 Thread Gerhard Fecher
What was the error message of the compiler?

By the way, the latest stable version of the 11.1 compiler was 11.1.075;
older ones contain an overestimation that you can easily find when you search 
this forum.
And, the same procedure as every week, see
http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/

About compiler switches, the Intel manual for 11.1 says, for example
(see also the section "Deprecated and Removed Compiler Options" in the manual):

ax, Qax

Tells the compiler to generate multiple, processor-specific auto-dispatch code 
paths for Intel processors if there is a performance benefit. 

IDE Equivalent
Windows: Code Generation > Add Processor-Optimized Code Path

Optimization > Generate Alternate Code Paths

Linux: None

Mac OS X: Code Generation > Add Processor-Optimized Code Path 

Architectures
IA-32, Intel® 64 architectures

Syntax
Linux and Mac OS X:
 -ax<processor>
 
Windows:
 /Qax<processor>
 

Arguments
processor
 Indicates the processor for which code is generated. The following 
descriptions refer to Intel® Streaming SIMD Extensions (Intel® SSE) and 
Supplemental Streaming SIMD Extensions (Intel® SSSE). Possible values are:

SSE4.2
 Can generate Intel® SSE4 Efficient Accelerated String and Text Processing 
instructions supported by Intel® Core™ i7 processors. Can generate Intel® SSE4 
Vectorizing Compiler and Media Accelerator, Intel® SSSE3, SSE3, SSE2, and SSE 
instructions and it can optimize for the Intel® Core™ processor family.
 
SSE4.1
 Can generate Intel® SSE4 Vectorizing Compiler and Media Accelerator 
instructions for Intel processors. Can generate Intel® SSSE3, SSE3, SSE2, and 
SSE instructions and it can optimize for Intel® 45nm Hi-k next generation 
Intel® Core™ microarchitecture. This replaces value S, which is deprecated.
 
SSSE3
 Can generate Intel® SSSE3, SSE3, SSE2, and SSE instructions for Intel 
processors and it can optimize for the Intel® Core™2 Duo processor family. For 
Mac OS* X systems, this value is only supported on Intel® 64 architecture. This 
replaces value T, which is deprecated.
 
SSE3
 Can generate Intel® SSE3, SSE2, and SSE instructions for Intel processors and 
it can optimize for processors based on Intel® Core™ microarchitecture and 
Intel NetBurst® microarchitecture. For Mac OS* X systems, this value is only 
supported on IA-32 architecture. This replaces value P, which is deprecated.
 
SSE2
 Can generate Intel® SSE2 and SSE instructions for Intel processors, and it can 
optimize for Intel® Pentium® 4 processors, Intel® Pentium® M processors, and 
Intel® Xeon® processors with Intel® SSE2. This value is not available on Mac 
OS* X systems. This replaces value N, which is deprecated.
 
 

Default
OFF
 No auto-dispatch code is generated. Processor-specific code is generated and 
is controlled by the setting of compiler option -m (Linux), compiler option 
/arch (Windows), or compiler option -x (Mac OS* X).
 

Description
This option tells the compiler to generate multiple, processor-specific 
auto-dispatch code paths for Intel processors if there is a performance 
benefit. It also generates a baseline code path. The baseline code is usually 
slower than the specialized code.

The baseline code path is determined by the architecture specified by the -x 
(Linux and Mac OS X) or /Qx (Windows) option. While there are defaults for the 
-x or /Qx option that depend on the operating system being used, you can 
specify an architecture for the baseline code that is higher or lower than the 
default. The specified architecture becomes the effective minimum architecture 
for the baseline code path. 

If you specify both the -ax and -x options (Linux and Mac OS X) or the /Qax and 
/Qx options (Windows), the baseline code will only execute on processors 
compatible with the processor type specified by the -x or /Qx option. 

This option tells the compiler to find opportunities to generate separate 
versions of functions that take advantage of features of the specified Intel® 
processor. 

If the compiler finds such an opportunity, it first checks whether generating a 
processor-specific version of a function is likely to result in a performance 
gain. If this is the case, the compiler generates both a processor-specific 
version of a function and a baseline version of the function. At run time, one 
of the versions is chosen to execute, depending on the Intel processor in use. 
In this way, the program can benefit from performance gains on more advanced 
Intel processors, while still working properly on older processors.

You can use more than one of the processor values by combining them. For 
example, you can specify -axSSE4.1,SSSE3 (Linux and Mac OS X) or 
/QaxSSE4.1,SSSE3 (Windows). You cannot combine the old style, deprecated 
options and the new options. For example, you cannot specify -axSSE4.1,T (Linux 
and Mac OS X) or /QaxSSE4.1,T (Windows). 
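
To relate this to the -axTW switch in the posted OPTIONS, here is a small Python sketch that lists the old-style letters in an -ax option together with the replacement named in the manual text above. It is only an illustration: the function and table names are made up, and for W and K the excerpt says they are deprecated without listing the replacement, so they are mapped to None rather than guessed.

```python
# Replacements as stated in the manual excerpt above.
# W and K are deprecated as well, but their replacements are not
# given in the excerpt, so they are recorded as None.
REPLACEMENTS = {
    "S": "SSE4.1",
    "T": "SSSE3",
    "P": "SSE3",
    "N": "SSE2",
    "W": None,
    "K": None,
}

NEW_VALUES = {"SSE4.2", "SSE4.1", "SSSE3", "SSE3", "SSE2"}

def deprecated_ax_letters(option):
    """Return (letter, replacement) pairs for old-style values in an -ax option."""
    if not option.startswith("-ax"):
        raise ValueError("expected an option of the form -ax...")
    found = []
    for token in option[3:].split(","):
        if token in NEW_VALUES:
            continue  # new-style value, nothing to report
        for letter in token:
            if letter in REPLACEMENTS:
                found.append((letter, REPLACEMENTS[letter]))
    return found

if __name__ == "__main__":
    for letter, new in deprecated_ax_letters("-axTW"):
        print(letter, "->", new if new else "deprecated, replacement not listed here")
```

For -axTW this reports T -> SSSE3 and flags W as deprecated; the manual itself should be consulted for the W and K replacements.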

Previous values W and K are deprecated. The details on replacements are as 
follows:

Mac OS X systems: 

[Wien] GGA-EV

2011-06-07 Thread Gerhard Fecher
Also note that this functional is optimized to reproduce the correct exchange; 
it is not suitable for total energy calculations, so do not use it for 
optimization.

Ciao
Gerhard


Dr. Gerhard H. Fecher
Institut of Inorganic and Analytical Chemistry
Johannes Gutenberg - University
55099 Mainz



[Wien] GGA-EV

2011-06-07 Thread Peter Blaha
And in addition: Nowadays I would NOT use EV-GGA, but the mBJ potential.
It gives gaps with much higher accuracy.

