Dear folks,

I've managed to compile the GPU-enabled version of PWSCF from the sources provided by Filippo Spiga, using the Portland (PGI) compilers and the MKL libraries. The node runs CentOS 6.6 and has 16 GB of RAM. I ran some small tests, and the results were the same as those obtained with a non-GPU version of qe-6.1 compiled with the GNU compilers.

When I test the executable with a more realistic job (a Cu surface of 75 atoms with 1 to 6 carbon atoms on it), an "out of memory" error occurs and the job terminates. I should note that the same job ran successfully on another, similar node that has no GPU card.

When I launch pw.x with "mpirun -np 1", I get

    ...
    Estimated max dynamical RAM per process > 10128.95MB

    Generating pointlists ...
    new r_m : 0.0689 (alat units) 1.6647 (a.u.) for type 1
    new r_m : 0.0689 (alat units) 1.6647 (a.u.) for type 2
    0: ALLOCATE: 2525186688 bytes requested; status = 2(out of memory)
    /opt/pgi/linux86-64/17.4/lib/libpgf90_rpm1.so(__fort_abortx+0x17) [0x2b646f7f2af7]
    /opt/pgi/linux86-64/17.4/lib/libpgf90.so(__fort_abort+0x5e) [0x2b646f41897e]
    /opt/pgi/linux86-64/17.4/lib/libcudafor.so(+0x5ac38) [0x2b6456f6cc38]
    /opt/pgi/linux86-64/17.4/lib/libcudafor.so(pgf90_dev_mod_alloc04+0xc9) [0x2b6456f6d70e]
    /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x5e16d7]
    /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x497953]
    /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x52b05d]
    /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x40d82c]
    /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x40d704]
    /lib64/libc.so.6(__libc_start_main+0xfd) [0x36f741ed5d]
    /usr/local/fspiga-qe-gpu-7e1de44/bin/pw.x() [0x40a1c9]
    --------------------------------------------------------------------------
    mpirun noticed that process rank 0 with PID 23370 on node n13 exited on signal 6 (Aborted).
    --------------------------------------------------------------------------

When I run the same job with "mpirun -np 8", I get

    ...
    Estimated total allocated dynamical RAM > 11048.12MB

    Generating pointlists ...
    new r_m : 0.0689 (alat units) 1.6647 (a.u.) for type 1
    new r_m : 0.0689 (alat units) 1.6647 (a.u.) for type 2
    0: ALLOCATE: 315564672 bytes requested; status = 2(out of memory)
    [a lot of error messages]

I cannot identify the source of the error, but I guess that it has to do with the GPU card. Running the deviceQuery program that comes with CUDA, I get (among a lot of other information)

    Device 0: "TITAN X (Pascal)"
      CUDA Driver Version / Runtime Version          8.0 / 8.0
      CUDA Capability Major/Minor version number:    6.1
      Total amount of global memory:                 12189 MBytes (12781158400 bytes)
      (28) Multiprocessors, (128) CUDA Cores/MP:     3584 CUDA Cores
      GPU Max Clock rate:                            1531 MHz (1.53 GHz)
      Memory Clock rate:                             5005 Mhz
      Memory Bus Width:                              384-bit
      L2 Cache Size:                                 3145728 bytes

Any help is welcome. I can provide the input file and the corresponding pseudopotentials if requested.

Thanks in advance,

Reinaldo Pis Diez
Center of Inorganic Chemistry
Natl Univ of La Plata
Argentina
_______________________________________________
Pw_forum mailing list
Pw_forum@pwscf.org
http://pwscf.org/mailman/listinfo/pw_forum