If you run the terminal command: echo $SCRATCH

does it return:

./

If so, it looks like there may still be a problem with how SCRATCH is defined, or with how "./" is resolved on your system.

In the error message, you can see:

/lunarc/nobackup/users/eishfh/WIEN2k/GaAs_ZB/David_project/3Mn001/ALL/test-so/3Mn/./3Mn.vectordn_1

The "./" may be the cause of the problem, because I would expect the path to be:

/lunarc/nobackup/users/eishfh/WIEN2k/GaAs_ZB/David_project/3Mn001/ALL/test-so/3Mn/3Mn.vectordn_1
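A quick way to check this from the shell (a minimal sketch; the temporary directory and file name below are only stand-ins for your actual case directory):

```shell
# Check what the WIEN2k scripts will expand $SCRATCH to.
echo "SCRATCH is set to: '$SCRATCH'"

# Verify that a vector file is reachable through a "./"-containing
# path like the one in the error message.
# (Hypothetical directory; substitute your own case directory.)
case_dir=$(mktemp -d)
touch "$case_dir/3Mn.vectordn_1"

# A lone "./" component is normalized away during pathname resolution,
# so both spellings should name the same file:
ls -l "$case_dir/./3Mn.vectordn_1"
ls -l "$case_dir/3Mn.vectordn_1"
```

If the first ls fails while the second succeeds, the "./" really is the culprit; otherwise the problem more likely lies in how SCRATCH is exported inside the batch job.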

On 11/12/2016 5:33 PM, Md. Fhokrul Islam wrote:

Hi Prof. Blaha,


I wasn't aware of the bug, but I will check the updates. I have repeated the calculation with 16 cores (a square processor grid) as you suggested, but I still got the same error. As before, the job crashes at lapwso. I don't see any missing file, as you can see from the list of vector files:


-rw-r--r--. 1 eishfh kalmar 12427583862 Nov 12 10:04 3Mn.vectordn_1

-rw-r--r--. 1 eishfh kalmar       77760 Nov 12 10:26 3Mn.vectorsodn_1

-rw-r--r--. 1 eishfh kalmar       77760 Nov 12 10:26 3Mn.vectorsoup_1

-rw-r--r--. 1 eishfh kalmar 12428559726 Nov 12 04:17 3Mn.vectorup_1


Here are the dayfile and the output error file. These are the only error messages I got.


case.dayfile:


cycle 1     (Sat Nov 12 01:21:39 CET 2016)  (100/99 to go)


> lapw0 -p (01:21:39) starting parallel lapw0 at Sat Nov 12 01:21:39 CET 2016

-------- .machine0 : 16 processors

14031.329u 15.362s 14:40.87 1594.6%     0+0k 90152+1974560io 175pf+0w

> lapw1 -up -p -c (01:36:20) starting parallel lapw1 at Sat Nov 12 01:36:20 CET 2016

-> starting parallel LAPW1 jobs at Sat Nov 12 01:36:20 CET 2016

running LAPW1 in parallel mode (using .machines)

1 number_of_parallel_jobs

au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188(1) 121331.481u 33186.223s 2:41:04.62 1598.7% 0+0k 0+29485672io 118pf+0w

Summary of lapw1para:

au188         k=0     user=0  wallclock=0

121367.583u 33215.702s 2:41:06.83 1599.1% 0+0k 288+29487024io 121pf+0w

> lapw1 -dn -p -c (04:17:27) starting parallel lapw1 at Sat Nov 12 04:17:27 CET 2016

-> starting parallel LAPW1 jobs at Sat Nov 12 04:17:27 CET 2016

running LAPW1 in parallel mode (using .machines.help)

1 number_of_parallel_jobs

au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188 au188(1) 233187.228u 100041.449s 5:47:30.00 1598.2% 0+0k 5832+35169304io 116pf+0w

Summary of lapw1para:

au188         k=0     user=0  wallclock=0

233263.580u 100102.639s 5:47:31.69 1598.7% 0+0k 6296+35170640io 118pf+0w

>   lapwso -up  -p -c   (10:04:59) running LAPWSO in parallel mode

** LAPWSO crashed!

1233.319u 23.612s 21:29.72 97.4%        0+0k 13064+7712io 17pf+0w

error: command /lunarc/nobackup/users/eishfh/SRC/Wien2k14.2-iomkl/lapwsopara -up -c lapwso.def failed


>   stop error

-----------------------

lapwso.error file:

** Error in Parallel LAPWSO

** Error in Parallel LAPWSO


-----------------------

output error file:

 LAPW0 END

 LAPW1 END

 LAPW1 END

forrtl: severe (39): error during read, unit 9, file /lunarc/nobackup/users/eishfh/WIEN2k/GaAs_ZB/David_project/3Mn001/ALL/test-so/3Mn/./3Mn.vectordn_1

Image             PC                Routine            Line     Source

lapwso_mpi         00000000004634E3  Unknown               Unknown Unknown

lapwso_mpi         000000000047F3C4  Unknown               Unknown Unknown

lapwso_mpi         000000000042BA1F  kptin_                     56 kptin.F

lapwso_mpi         0000000000431566  MAIN__                    523 lapwso.F

lapwso_mpi         000000000040B3EE  Unknown               Unknown Unknown

libc.so.6         00002BA34EDECB15  Unknown               Unknown Unknown

lapwso_mpi         000000000040B2E9  Unknown               Unknown Unknown


-----------------------


Thanks,
Fhokrul
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
