Good afternoon -
I recently installed gromacs-4.6 on CentOS6.3 and the installation went
just fine.
I have a Tesla C2075 GPU.
I then downloaded the benchmark directories and ran the
GPU/dhfr-solv-PME.bench benchmark.
This is what I got:
Using 1 MPI thread
Using 4 OpenMP threads
1 GPU detected:
#0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible
1 GPU user-selected for this run: #0
Back Off! I just backed up ener.edr to ./#ener.edr.1#
starting mdrun 'Protein in water'
-1 steps, infinite ps.
step 40: timed with pme grid 64 64 64, coulomb cutoff 1.000: 4122.9 M-cycles
step 80: timed with pme grid 56 56 56, coulomb cutoff 1.143: 3685.9 M-cycles
step 120: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3110.8 M-cycles
step 160: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3365.1 M-cycles
step 200: timed with pme grid 40 40 40, coulomb cutoff 1.600: 3499.0 M-cycles
step 240: timed with pme grid 52 52 52, coulomb cutoff 1.231: 3982.2 M-cycles
step 280: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3129.2 M-cycles
step 320: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3425.4 M-cycles
step 360: timed with pme grid 42 42 42, coulomb cutoff 1.524: 2979.1 M-cycles
optimal pme grid 42 42 42, coulomb cutoff 1.524
step 4300 performance: 1.8 ns/day
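In case it helps with comparing the tuning steps, here is a minimal Python sketch (my own helper, not part of GROMACS) that pulls the "timed with pme grid" lines out of the output above and reports the fastest setting. The regular expression and the function name parse_tuning are assumptions based only on the line format shown here; I am also assuming the cutoff is in nm. It prints grid size times cutoff for each line, which stays close to 64 throughout, so the grid and the cutoff appear to be scaled together.

```python
import re

# Tuning lines copied from the mdrun output above.
LOG = """\
step 40: timed with pme grid 64 64 64, coulomb cutoff 1.000: 4122.9 M-cycles
step 80: timed with pme grid 56 56 56, coulomb cutoff 1.143: 3685.9 M-cycles
step 120: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3110.8 M-cycles
step 160: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3365.1 M-cycles
step 200: timed with pme grid 40 40 40, coulomb cutoff 1.600: 3499.0 M-cycles
step 240: timed with pme grid 52 52 52, coulomb cutoff 1.231: 3982.2 M-cycles
step 280: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3129.2 M-cycles
step 320: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3425.4 M-cycles
step 360: timed with pme grid 42 42 42, coulomb cutoff 1.524: 2979.1 M-cycles
"""

# Assumed line format, inferred from the output above. The \s+ before
# "M-cycles" also matches a line break, in case the mail client wrapped it.
PATTERN = re.compile(
    r"step\s+(\d+): timed with pme grid (\d+) (\d+) (\d+), "
    r"coulomb cutoff ([\d.]+):\s+([\d.]+)\s+M-cycles"
)


def parse_tuning(text):
    """Return a list of (step, grid, cutoff, mcycles) tuples."""
    rows = []
    for m in PATTERN.finditer(text):
        step = int(m.group(1))
        grid = tuple(int(m.group(i)) for i in (2, 3, 4))
        rows.append((step, grid, float(m.group(5)), float(m.group(6))))
    return rows


rows = parse_tuning(LOG)
best = min(rows, key=lambda r: r[3])  # fewest M-cycles is fastest

for step, grid, cutoff, mcycles in rows:
    print(
        f"step {step:4d}  grid {grid[0]:3d}  cutoff {cutoff:.3f}  "
        f"{mcycles:7.1f} M-cycles  grid*cutoff = {grid[0] * cutoff:.1f}"
    )

print("fastest setting:", best)
print(f"speed-up over the first setting: {rows[0][3] / best[3]:.2f}x")
```

The fastest setting it finds is the same one mdrun reports as optimal, so the tuning itself looks consistent; it does not explain the low ns/day figure.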
and from the nvidia-smi output:
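Tue Apr 9 10:13:46 2013
+------------------------------------------------------+
| NVIDIA-SMI 4.304.37   Driver Version: 304.37         |
|-------------------------------+----------------------+----------------------+
| GPU  Name                     | Bus-Id        Disp.  | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage         | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla C2075              | :03:00.0      On     |                    0 |
| 30%   67C    P0    80W / 225W |   4%  200MB / 5375MB |      4%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Compute processes:                                               GPU Memory |
|  GPU       PID  Process name                                     Usage      |
|=============================================================================|
|    0     22568  mdrun                                                  59MB |
+-----------------------------------------------------------------------------+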
So I am only getting 1.8 ns/day! Is that right? That seems very low for a
GPU run, given that the CPU-only test gives me the same figure:
step 200 performance: 1.8 ns/day    vol 0.79  imb F 14%
From the md.log of the GPU test:
Detecting CPU-specific acceleration.
Present hardware specification:
Vendor: GenuineIntel
Brand: Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz
Family: 6 Model: 45 Stepping: 7
Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc
pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
tdt x2apic
Acceleration most likely to fit this hardware: AVX_256
Acceleration selected at GROMACS compile time: AVX_256
1 GPU detected:
#0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible
1 GPU user-selected for this run: #0
Will do PME sum in reciprocal space.
Any thoughts as to why it is so slow?
many thanks!
Ben
--
Research Assistant Professor
North Carolina State University
Department of Molecular and Structural Biochemistry
128 Polk Hall
Raleigh, NC 27695
Phone: (919)-513-0698
Fax: (919)-515-2047
--
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users