Re: [gmx-users] GPU-gromacs

2013-10-25 Thread Carsten Kutzner
On Oct 25, 2013, at 4:07 PM, aixintiankong  wrote:

> Dear prof.,
> I want to install GROMACS on a multi-core workstation with a GPU (Tesla C2075).
> Should I install OpenMPI or MPICH2?
If you want to run Gromacs on just one workstation with a single GPU, you do
not need to install an MPI library at all!
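For such a run, the thread-MPI support built into GROMACS 4.6 parallelizes over all cores of the node on its own. A minimal sketch of the launch (the file names are placeholders, not from this thread):

```shell
# Plain (non-MPI) mdrun: thread-MPI/OpenMP use all cores of the workstation,
# and the single GPU is detected automatically; -gpu_id 0 just makes it explicit.
mdrun -s topol.tpr -deffnm run -gpu_id 0
```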

Carsten

> -- 
> gmx-users mailing list: gmx-users@gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> * Please search the archive at 
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> * Please don't post (un)subscribe requests to the list. Use the 
> www interface or send it to gmx-users-requ...@gromacs.org.
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


--
Dr. Carsten Kutzner
Max Planck Institute for Biophysical Chemistry
Theoretical and Computational Biophysics
Am Fassberg 11, 37077 Goettingen, Germany
Tel. +49-551-2012313, Fax: +49-551-2012302
http://www.mpibpc.mpg.de/grubmueller/kutzner
http://www.mpibpc.mpg.de/grubmueller/sppexa



Re: [gmx-users] GPU version of Gromacs

2013-08-19 Thread Justin Lemkul



On 8/19/13 5:38 AM, grita wrote:

Hey guys,

Is it possible to run an SD simulation using the pull code in the GPU
version of Gromacs?



Have you tried it?

-Justin

--
==

Justin A. Lemkul, Ph.D.
Postdoctoral Fellow

Department of Pharmaceutical Sciences
School of Pharmacy
Health Sciences Facility II, Room 601
University of Maryland, Baltimore
20 Penn St.
Baltimore, MD 21201

jalem...@outerbanks.umaryland.edu | (410) 706-7441

==


Re: [gmx-users] GPU metadynamics

2013-08-15 Thread Albert

On 08/15/2013 11:21 AM, Jacopo Sgrignani wrote:

Dear Albert
to run parallel jobs on multiple GPUs you should use something like this:

mpirun -np (number of MPI ranks) mdrun_mpi ... -gpu_id 


so you will have 4 calculations, one per GPU.


Jacopo


Thanks a lot for the reply, but there is a problem with the following command:

mpirun -np 4 mdrun_mpi -s md.tpr -v -g md.log -o md.trr -x md.xtc 
-plumed plumed2.dat -e md.edr -gpu_id 0123


--log---

4 GPUs detected on host node3:
  #0: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
  #1: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
  #2: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
  #3: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible


---
Program mdrun_mpi, VERSION 4.6.3
Source code file: 
/home/albert/install/source/gromacs-4.6.3/src/gmxlib/gmx_detect_hardware.c, 
line: 349


Fatal error:
Incorrect launch configuration: mismatching number of PP MPI processes 
and GPUs per node.
mdrun_mpi was started with 1 PP MPI process per node, but you provided 4 
GPUs.

For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
---
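The error means mdrun saw only one PP MPI rank even though four GPU ids were listed; the rank count must match the id list. A sketch of a matching launch, with the caveat (an assumption, not stated in the thread) that mpirun must come from the same MPI library mdrun_mpi was compiled against, since a mismatch makes every process believe it is the only rank and reproduces exactly this "1 PP MPI process per node" message:

```shell
# Four PP MPI ranks, one per listed GPU id. Note there is no trailing
# period after -gpu_id 0123; the "0123." in the original command is not
# a valid GPU id string.
mpirun -np 4 mdrun_mpi -s md.tpr -v -g md.log -o md.trr -x md.xtc \
    -plumed plumed2.dat -e md.edr -gpu_id 0123
```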



Re: [gmx-users] GPU metadynamics

2013-08-15 Thread Jacopo Sgrignani
Dear Albert
to run parallel jobs on multiple GPUs you should use something like this:

mpirun -np (number of MPI ranks) mdrun_mpi ... -gpu_id 


so you will have 4 calculations, one per GPU.


Jacopo

Sent from my iPad

On 15 Aug 2013, at 10:56, Albert wrote:

> Hello:
> 
> I've got two GTX 690 GPUs in a workstation, and I compiled gromacs-4.6.3 with 
> plumed and MPI support. I am trying to run some metadynamics with mdrun with 
> the command:
> 
> mdrun_mpi -s md.tpr -v -g md.log -o md.trr -x md.xtc -plumed plumed2.dat -e 
> md.edr
> 
> but mdrun can only use 1 GPU as indicated in the log file:
> 
> 
> 
> 4 GPUs detected on host node3:
>  #0: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
>  #1: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
>  #2: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
>  #3: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
> 
> 
> NOTE: potentially sub-optimal launch configuration, mdrun_mpi started with 
> less
>  PP MPI process per node than GPUs available.
>  Each PP MPI process can use only one GPU, 1 GPUs per node will be used.
> 
> 1 GPU auto-selected for this run: #0
> 
> 
> 
> I am just wondering how we can use multiple GPUs for this kind of job?
> 
> THX
> Albert


Re: [gmx-users] GPU + surface

2013-08-09 Thread Lucio Montero
Hello. Have you removed periodicity? You may only be seeing water 
molecules crossing between copies of the periodic system.

Lucio Montero
Ph. D. student
Instituto de Biotecnologia, UNAM
Mexico


El 08/08/13 07:39, Ondrej Kroutil escribió:

Dear GMX users.
   I have run a simulation of ions and water near a quartz surface
(ClayFF) using a GPU (GTX580) and Gromacs (4.6.1, single precision, 64
bit, SSE4.1, fftw-3.3.3) and have observed strange behavior of the water
and ions. It is an NVT simulation with frozen surface atoms (see .mdp
below) and a negative charge on the surface (deprotonated silanols); the
system is overall neutral. I used the same .mdp for the normal CPU
simulation and the GPU simulation, and just added the -testverlet option
for the GPU run.
   In the CPU simulation the ions and water behaved as expected (see
http://i1315.photobucket.com/albums/t587/Andrew_Twister/cpu-simul_zpscf784b46.png)
, but in the GPU simulation there was a visible flow of ions toward the
image of the lower surface, and all water molecules were oriented with
hydrogens facing downward and oxygens facing upward (see
http://i1315.photobucket.com/albums/t587/Andrew_Twister/gpu-simul_zps2c160ea6.png).
It looks as if an electric field were applied, but there is none.
   Do you think there is a problem in the initial setup of parameters in
the .mdp file? Or maybe a problem with the freeze groups? With no freezing
the situation is better, but there is still a visible flow and pairing of
like ions (see 
http://i1315.photobucket.com/albums/t587/Andrew_Twister/gpu-no_freeze_zps72ef3938.png).
   It looks like an electrostatics problem. Do you have any hints, please?
And sorry if I missed a similar topic on the mailing list, but I couldn't
find anything similar.

   Ondrej Kroutil

integrator   =  md
dt   =  0.001
nsteps   =  10
comm_mode=  linear
nstcomm  =  1000
nstxout  =  0
nstxtcout=  1000
nstvout  =  0
nstfout  =  0
nstlog   =  1000
xtc_precision=  1
nstlist  =  10
ns_type  =  grid
rlist=  1.2
coulombtype  =  PME
rcoulomb =  1.2
rvdw =  1.2
constraints  =  hbonds
constraint_algorithm =  lincs
lincs_iter   =  1
fourierspacing   =  0.1
pme_order   =  4
ewald_rtol  =  1e-5
ewald_geometry  =  3dc
optimize_fft=  yes
; Nose-Hoover temperature coupling
Tcoupl =  nose-hoover
tau_t  =  1
tc_grps=  system
ref_t  =  298.15
; No Pressure
; Pcoupl =   Parrinello-Rahman
pcoupltype  =  semiisotropic
tau_p   =  1.0
compressibility =  0 4.6e-5
ref_p   =  0 1.0
; OTHER
periodic_molecules  =  no
pbc =  xyz
;energygrps = SOL SOH
freezegrps  = BULK
freezedim   = Y Y Y
gen_vel = yes
gen_temp= 298.15
gen_seed= -1





RE: [gmx-users] GPU + surface

2013-08-08 Thread Berk Hess
Hi,

The -testverlet option is only for testing (as the name implies).
Please set the mdp option cutoff-scheme = Verlet instead.
Also please update to 4.6.3, as this potential issue might already have been 
resolved.
With the Verlet scheme the CPU and GPU should give the same result, correct 
or incorrect.

Could it be that your system is located partially above and partially below z=0?
This will cause problems with ewald-geometry = 3dc. To use this option you need 
to ensure your whole system is in the same periodic image.
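One way to satisfy that requirement is to translate the whole system upward before generating the run input. A hypothetical sketch using the GROMACS 4.x tools; the 2 nm offset and the file names are placeholders to be adapted to the actual box:

```shell
# Shift all coordinates +2 nm along z so the slab lies entirely above z=0,
# then rebuild the .tpr from the shifted structure.
editconf -f conf.gro -o shifted.gro -translate 0 0 2
grompp -f grompp.mdp -c shifted.gro -p topol.top -o topol.tpr
```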

Cheers,

Berk

---
> Date: Thu, 8 Aug 2013 14:39:59 +0200
> From: okrou...@gmail.com
> To: gmx-users@gromacs.org
> Subject: [gmx-users] GPU + surface
> --
> Ondřej Kroutil
> Faculty of Health and Social Studies
> University of South Bohemia
> Jirovcova 24, Ceske Budejovice
> The Czech Republic
> E-mail: okrou...@gmail.com
> Mobile: +420 736 537 190


Re: Re: Re: [gmx-users] GPU-based workstation

2013-08-01 Thread Szilárd Páll
I may be late with the reply, but here are my 2 cents.

If you need a single very fast machine (i.e. maximum single simulation
performance), you should get
- either a very fast desktop CPU: i7 3930 or for 2x more the 3970 -
which, BTW, I think is not worth it ($600-1000)
- or 1-2 fast Xeon E5-s - depending on how many and which these will
be $1k-2k each.

For a single-CPU setup two Titans may be overkill, and (at least
with the current code) you may get very little extra performance from
using two GPUs instead of one. With a dual-socket machine (and decently fast
CPUs), if you have a large enough input system, two GPUs will work
nicely.

However, if you care about total simulation throughput and you have
multiple simulations to run, I'd suggest that you buy 2-3 machines
with the components that give the best ns/day/$: something like
i7-4670 or 4770 with GTX 680/770 (or 780).



--
Szilárd


On Thu, Jun 27, 2013 at 1:01 PM, James Starlight  wrote:
> Back to my question:
> I want to build a GPU-based workstation around two GeForce Titans.
>
> My current budget only allows a high-end 6-core Core i7-3930 and a motherboard
> with 5 PCI-E slots (like the Asus Rampage IV series). Would this system be
> balanced with two GPUs? Should I use two 6-8 core Xeons instead of the i7?
>
> James
>
> 2013/5/29 James Starlight 
>
>> Dear Dr. Pall!
>>
>> Thank you for your suggestions!
>>
>> Assuming I have a budget of $5000 and want to build a GPU-based
>> desktop with it.
>>
>> Previously I used a single 4-core i5 with a GTX 670 and got an average of 10
>> ns/day for 70k-atom systems (1.0 nm cutoffs, no virtual sites,
>> sd integrator).
>>
>> Now I'd like to build a system based on two high-end GeForces (e.g. the
>> TITAN).
>> Should that system include 2 CPUs for good balancing? (E.g. two 6-core
>> Xeons with faster clocks could be better for simulations than
>> an i7, couldn't they?)
>>
>> What additional motherboard features should I consider for such a system?
>>
>> James
>>
>>
>> 2013/5/28 lloyd riggs 
>>
>>> Dear Dr. Pali,
>>>
>>> Thank you,
>>>
>>> Stephan Watkins
>>>
>>> *Gesendet:* Dienstag, 28. Mai 2013 um 19:50 Uhr
>>> *Von:* "Szilárd Páll" 
>>>
>>> *An:* "Discussion list for GROMACS users" 
>>> *Betreff:* Re: Re: [gmx-users] GPU-based workstation
>>> Dear all,
>>>
>>> As far as I understand, the OP is interested in hardware for *running*
>>> GROMACS 4.6 rather than developing code or running LINPACK.
>>>
>>>
>>> To get best performance it is important to use a machine with hardware
>>> balanced for GROMACS' workloads. Too little GPU resources will result
>>> in CPU idling; too much GPU resources will lead to the runs being CPU
>>> or multi-GPU scaling bound and above a certain level GROMACS won't be
>>> able to make use of additional GPUs.
>>>
>>> Of course, the balance will depend both on hardware and simulation
>>> settings (mostly the LJ cut-off used).
>>>
>>> An additional factor to consider is typical system size. To reach near
>>> peak pair-force throughput on GPUs you typically need >20k-40k
>>> particles/GPU (depends on the architecture) and throughput drops below
>>> these values. Hence, in most cases it is preferred to use fewer and
>>> faster GPUs rather than more.
>>>
>>> Without knowing the budget and intended use of the machine it is hard
>>> to make suggestions, but I would say for a budget desktop box a
>>> quad-core Intel Ivy Bridge or the top-end AMD Piledriver CPU with a
>>> fast Kepler GTX card (e.g. GTX 680 or GTX 770/780) should work well.
>>> If you're considering dual-socket workstations, I suggest you go with
>>> the higher core-count and higher frequency Intel CPUs (6+ cores, >2.2
>>> GHz), otherwise you may not see as much benefit as you would expect
>>> based on the insane price tag (especially compared to an i7
>>> 3930K or its IVB successor).
>>>
>>> Cheers,
>>> --
>>> Szilárd
>>>
>>>
>>> On Sat, May 25, 2013 at 1:02 PM, lloyd riggs  wrote:
>>> > More RAM the better, and the best I have seen is 4 GPU work station. I
>>> can
>>> > use/have used 4. The GPU takes 2 slots though, so a 7-8 PCIe board is
>>> > really 3-4 GPU, except the tyan mentioned (there designed as blades so
>>> an 8
>>> > or 10 slot board really holds 8 or 10 GPU's). There's co

Re: [gmx-users] gpu cluster explanation

2013-07-23 Thread Francesco
Hi Richard,
Thank you for the help, and sorry for the delay in my reply.
I tried some test runs changing some parameters (e.g. removing PME) and I
was able to reach 20 ns/day, so I think that 9-11 ns/day is the maximum
I can obtain with my settings.

Thank you again for your help.

cheers,

Fra


Re: [gmx-users] gpu cluster explanation

2013-07-12 Thread Richard Broadbent



On 12/07/13 13:26, Francesco wrote:

Hi all,
I'm working with a 200K-atom system (protein + explicit water), and
after a while using a CPU cluster I had to switch to a GPU cluster.
I read both the "Acceleration and parallelization" and Gromacs-GPU
documentation pages
(http://www.gromacs.org/Documentation/Acceleration_and_parallelization
and
http://www.gromacs.org/Documentation/Installation_Instructions_4.5/GROMACS-OpenMM)
but it's a bit confusing, and I need help checking whether I have
understood correctly. :)
I have two types of nodes:
3 GPUs (NVIDIA Tesla M2090) and 2 CPUs with 6 cores each (Intel Xeon E5649 @
2.53GHz)
8 GPUs and 2 CPUs (6 cores each)

1) I can only have 1 MPI rank per GPU, meaning that with 3 GPUs I can have 3
MPI ranks max.
2) Because I have 12 cores I can open 4 OpenMP threads per MPI rank, since
4x3 = 12.

Now if I have a node with 8 GPUs, I can use 4 GPUs:
4 MPI ranks and 3 OpenMP threads each.
Is that right?
Is it possible to use 8 GPUs and only 8 cores?


You could set -ntomp 0 and set up MPI/thread-MPI to use 8 cores. 
However, a system that unbalanced (a huge amount of GPU power against 
comparatively little CPU power) is unlikely to get great performance.
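The rank/thread bookkeeping from the question can be checked with simple shell arithmetic; the mpirun line in the comment below is illustrative (not from the thread) and assumes an MPI-enabled mdrun_mpi:

```shell
# One PP MPI rank per GPU; OpenMP threads fill the remaining cores.
NCORES=12                    # per node: 2 x 6-core Xeon E5649
NGPUS=3                      # Tesla M2090s per node
NTOMP=$((NCORES / NGPUS))    # OpenMP threads per MPI rank
echo "ranks=$NGPUS threads_per_rank=$NTOMP"
# With these numbers: mpirun -np 3 mdrun_mpi -ntomp 4 -gpu_id 012 -s input_50.tpr
```

For the 8-GPU node the same arithmetic gives 12/8 = 1 thread per rank (integer division), which is exactly the CPU-starved imbalance warned about above.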


Using Gromacs 4.6.2 and 144 CPU cores I reach 35 ns/day, while with 3
GPUs and 12 cores I get 9-11 ns/day.

That slowdown is in line with what I got when I tried a similar CPU-GPU 
setup. That said, others might have some advice that will improve your 
performance.



the command that I use is:
mdrun -dlb yes -s input_50.tpr -deffnm 306s_50 -v
with the number of GPUs set via the batch script:
#BSUB -n 3

I also tried to set -npme / -nt / -ntmpi / -ntomp, but nothing changes.

The mdp file and some statistics are following:

 START MDP 

title = G6PD wt molecular dynamics (2bhl.pdb) - NPT MD

; Run parameters
integrator  = md; Algorithm options
nsteps  = 2500  ; maximum number of steps to
perform [50 ns]
dt  = 0.002 ; 2 fs = 0.002 ps

; Output control
nstxout= 1 ; [steps] freq to write coordinates to
trajectory, the last coordinates are always written
nstvout= 1 ; [steps] freq to write velocities to
trajectory, the last velocities are always written
nstlog  = 1 ; [steps] freq to write energies to log
file, the last energies are always written
nstenergy = 1  ; [steps] write energies to disk
every nstenergy steps
nstxtcout  = 1 ; [steps] freq to write coordinates to
xtc trajectory
xtc_precision   = 1000  ; precision to write to xtc trajectory
(1000 = default)
xtc_grps= system; which coordinate
group(s) to write to disk
energygrps  = system; or System / which energy
group(s) to write to disk

; Bond parameters
continuation= yes   ; restarting from npt
constraints = all-bonds ; Bond types to replace by constraints
constraint_algorithm= lincs ; holonomic constraints
lincs_iter  = 1 ; accuracy of LINCS
lincs_order = 4 ; also related to
accuracy
lincs_warnangle  = 30; [degrees] maximum angle that a bond can
rotate before LINCS will complain



That seems a little loose for constraints but setting that up and 
checking it's conserving energy and preserving bond lengths is something 
you'll have to do yourself


Richard

; Neighborsearching
ns_type = grid  ; method of updating neighbor list
cutoff-scheme = Verlet
nstlist = 10; [steps] frequence to update
neighbor list (10)
rlist = 1.0   ; [nm] cut-off distance for the
short-range neighbor list  (1 default)
rcoulomb  = 1.0   ; [nm] long range electrostatic cut-off
rvdw  = 1.0   ; [nm]  long range Van der Waals cut-off

; Electrostatics
coulombtype= PME  ; treatment of long range electrostatic
interactions
vdwtype = cut-off   ; treatment of Van der Waals
interactions

; Periodic boundary conditions
pbc = xyz

; Dispersion correction
DispCorr= EnerPres  ; applying long
range dispersion corrections

; Ewald
fourierspacing= 0.12; grid spacing for FFT  -
controls the highest magnitude of wave vectors (0.12)
pme_order = 4 ; interpolation order for PME, 4 = cubic
ewald_rtol= 1e-5  ; relative strength of Ewald-shifted
potential at rcoulomb

; Temperature coupling
tcoupl  = nose-hoover   ; temperature
coupling with Nose-Hoover ensemble
tc_grps = Protein Non-Protein
tau_t   = 0.40.4; [ps]
time constant
ref_t   = 310310; [K]
reference temperature for coupling (310 K ≈ 37 °C)

; Pressure coupling
pcoupl  = parrinello-rahman
pcoupltype= isotro

Re: Re: Re: [gmx-users] GPU-based workstation

2013-06-27 Thread James Starlight
Back to my question:
I want to build a GPU-based workstation around two GeForce Titans.

My current budget only allows a high-end 6-core Core i7-3930 and a motherboard
with 5 PCI-E slots (like the Asus Rampage IV series). Would this system be
balanced with two GPUs? Should I use two 6-8 core Xeons instead of the i7?

James

2013/5/29 James Starlight 

> Dear Dr. Pall!
>
> Thank you for your suggestions!
>
> Assuming I have a budget of $5000 and want to build a GPU-based
> desktop with it.
>
> Previously I used a single 4-core i5 with a GTX 670 and got an average of 10
> ns/day for 70k-atom systems (1.0 nm cutoffs, no virtual sites,
> sd integrator).
>
> Now I'd like to build a system based on two high-end GeForces (e.g. the
> TITAN).
> Should that system include 2 CPUs for good balancing? (E.g. two 6-core
> Xeons with faster clocks could be better for simulations than
> an i7, couldn't they?)
>
> What additional motherboard features should I consider for such a system?
>
> James
>
>
> 2013/5/28 lloyd riggs 
>
>> Dear Dr. Pali,
>>
>> Thank you,
>>
>> Stephan Watkins
>>
>> *Gesendet:* Dienstag, 28. Mai 2013 um 19:50 Uhr
>> *Von:* "Szilárd Páll" 
>>
>> *An:* "Discussion list for GROMACS users" 
>> *Betreff:* Re: Re: [gmx-users] GPU-based workstation
>> Dear all,
>>
>> As far as I understand, the OP is interested in hardware for *running*
>> GROMACS 4.6 rather than developing code or running LINPACK.
>>
>>
>> To get best performance it is important to use a machine with hardware
>> balanced for GROMACS' workloads. Too little GPU resources will result
>> in CPU idling; too much GPU resources will lead to the runs being CPU
>> or multi-GPU scaling bound and above a certain level GROMACS won't be
>> able to make use of additional GPUs.
>>
>> Of course, the balance will depend both on hardware and simulation
>> settings (mostly the LJ cut-off used).
>>
>> An additional factor to consider is typical system size. To reach near
>> peak pair-force throughput on GPUs you typically need >20k-40k
>> particles/GPU (depends on the architecture) and throughput drops below
>> these values. Hence, in most cases it is preferred to use fewer and
>> faster GPUs rather than more.
>>
>> Without knowing the budget and intended use of the machine it is hard
>> to make suggestions, but I would say for a budget desktop box a
>> quad-core Intel Ivy Bridge or the top-end AMD Piledriver CPU with a
>> fast Kepler GTX card (e.g. GTX 680 or GTX 770/780) should work well.
>> If you're considering dual-socket workstations, I suggest you go with
>> the higher core-count and higher frequency Intel CPUs (6+ cores, >2.2
>> GHz), otherwise you may not see as much benefit as you would expect
>> based on the insane price tag (especially compared to an i7
>> 3930K or its IVB successor).
>>
>> Cheers,
>> --
>> Szilárd
>>
>>
>> On Sat, May 25, 2013 at 1:02 PM, lloyd riggs  wrote:
>> > The more RAM the better, and the best I have seen is a 4-GPU workstation;
>> > I can and have used 4. A GPU takes 2 slots though, so a 7-8 slot PCIe
>> > board really holds 3-4 GPUs, except the Tyan mentioned (those are designed
>> > as blades, so an 8 or 10 slot board really holds 8 or 10 GPUs). There are
>> > cooling problems with GPUs, though: on a board they're packed tightly, so
>> > extra cooling may help avoid blowing a GPU, but I would look for good ones
>> > (ask around), as it's a video-game market and they go for looks even
>> > though it all sits in a case. The external RAM (not onboard GPU RAM) helps
>> > if you run a larger sim, but I don't know about performance; for the
>> > onboard GPU RAM, the more the merrier. So yes, with a normal workstation
>> > you can get 4 GPUs on a $300 US board, but then the price goes way up
>> > ($3000-4000 US for an 8-10 GPU board). RAM ordered abroad is also cheap,
>> > 8 or 16 GB, vs. the shop. I have used 4 GPUs, but only with test software,
>> > not Gromacs, so it would be nice to see performance numbers. For a small
>> > 100-atom molecule and 500 solvent molecules, using just the CPU, I get
>> > 5-10 minutes real time for a 1 ns sim, but simple large runs (800 amino
>> > acids, 25,000 solvent) clock in at around 1 hour real time for, say,
>> > 50 ps equilibrations (NVT or NPT).
>> >
>> > Stephan
>> >
>> > Gesendet: Samstag, 25. Mai 2013 um 07:54 Uhr
>> > Von: &qu

Re: [gmx-users] GPU / CPU load imblance

2013-06-25 Thread Justin Lemkul



On 6/25/13 6:33 PM, Dwey wrote:

Hi gmx-users,

 I used an 8-core AMD CPU with a GTX 680 GPU [with 1536 CUDA cores] to
run an example of Umbrella Sampling provided by Justin.
I am happy that GPU acceleration indeed helped me significantly reduce the
computation time (from 34 hours to 7 hours) in this example.
However, I found there was a NOTE on the screen like:

++
  The GPU has >20% more load than the CPU. This imbalance causes
performance loss, consider using a shorter cut-off and a finer PME grid
  ++

Given the >20% load imbalance, I wonder if someone can give suggestions on
how to avoid the performance loss, either through hardware (GPU/CPU)
improvements or by modifying the .mdp file (see below).



I would avoid tweaking the .mdp settings.  There have been several reports where 
people hacked at nonbonded cutoffs to get better performance, and it resulted in 
totally useless output.  These settings are part of the force field.  Avoid 
changing them.



In terms of hardware, does this NOTE suggest that I should use a
higher-capacity GPU like a GTX 780 [with 2304 CUDA cores] to balance the
load or catch up on speed?
If so, would it help to add another GTX 680 card to the same box?
Or will that cause a GPU/CPU load imbalance again, with two GPUs
waiting for the 8-core CPU?


There has been a lot of discussion on hardware, GPU/CPU balancing, etc. in 
recent days.  Please check the archive.  Some of the threads are quite detailed.


-Justin



Second,

++
Force evaluation time GPU/CPU: 4.006 ms/2.578 ms = 1.554
For optimal performance this ratio should be close to 1
++

I have no idea how these two numbers - 4.006 ms for the GPU and 2.578 ms for
the CPU - are evaluated.
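For what it's worth, the 1.554 figure is just the ratio of the two per-step force timings mdrun reports (nonbonded work offloaded to the GPU vs. the PME/bonded work left on the CPU); a quick sketch, not part of the original post:

```python
# mdrun's balance note compares per-step force-task times: nonbonded work
# offloaded to the GPU vs. the PME (and bonded) work kept on the CPU.
gpu_ms, cpu_ms = 4.006, 2.578          # values reported in the log above
ratio = gpu_ms / cpu_ms
print(f"GPU/CPU force time ratio: {ratio:.3f}")  # ~1.554, i.e. GPU >20% more loaded
```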

It would be very helpful to know how to modify the attached mdp for a better
load balance between GPU and CPU.

I appreciate kind advice and hints to improve this mdp file.

Thanks,

Dwey

### courtesy  to  Justin #

title   = Umbrella pulling simulation
define  = -DPOSRES_B
; Run parameters
integrator  = md
dt  = 0.002
tinit   = 0
nsteps  = 5000000   ; 10 ns
nstcomm = 10
; Output parameters
nstxout = 50000 ; every 100 ps
nstvout = 50000
nstfout = 5000
nstxtcout   = 5000  ; every 10 ps
nstenergy   = 5000
; Bond parameters
constraint_algorithm= lincs
constraints = all-bonds
continuation= yes
; Single-range cutoff scheme
nstlist = 5
ns_type = grid
rlist   = 1.4
rcoulomb= 1.4
rvdw= 1.4
; PME electrostatics parameters
coulombtype = PME
fourierspacing  = 0.12
fourier_nx  = 0
fourier_ny  = 0
fourier_nz  = 0
pme_order   = 4
ewald_rtol  = 1e-5
optimize_fft= yes
; Nose-Hoover temperature coupling is on in two groups
Tcoupl  = Nose-Hoover
tc_grps = Protein   Non-Protein
tau_t   = 0.5   0.5
ref_t   = 310   310
; Pressure coupling is on
Pcoupl  = Parrinello-Rahman
pcoupltype  = isotropic
tau_p   = 1.0
compressibility = 4.5e-5
ref_p   = 1.0
refcoord_scaling = com
; Generate velocities is off
gen_vel = no
; Periodic boundary conditions are on in all directions
pbc = xyz
; Long-range dispersion correction
DispCorr= EnerPres
cutoff-scheme   = Verlet
; Pull code
pull= umbrella
pull_geometry   = distance
pull_dim= N N Y
pull_start  = yes
pull_ngroups= 1
pull_group0 = Chain_B
pull_group1 = Chain_A
pull_init1  = 0
pull_rate1  = 0.0
pull_k1 = 1000  ; kJ mol^-1 nm^-2
pull_nstxout= 1000  ; every 2 ps
pull_nstfout= 1000  ; every 2 ps
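As a sanity check on the intervals above, each step count maps to simulation time as nsteps × dt (a small illustrative sketch using the dt = 0.002 ps from this file; not part of the original post):

```python
# Convert mdp step counts to simulation time, with dt = 0.002 ps as above.
dt_ps = 0.002

def steps_to_time_ps(nsteps, dt=dt_ps):
    """Return the simulation time in ps covered by nsteps integration steps."""
    return nsteps * dt

print(steps_to_time_ps(5_000_000))   # 10 ns total run, matching the "; 10 ns" comment
print(steps_to_time_ps(5_000))       # 10 ps between frames, matching nstxtcout = 5000
```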



--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin


--
gmx-users mailing list gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
* Please don't post (un)subscribe requests to the list. Use the 
www interface or send it to gmx-users-requ...@gromacs.org.

* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


Re: [gmx-users] GPU ECC question

2013-06-09 Thread Szilárd Páll
On Sat, Jun 8, 2013 at 9:21 PM, Albert  wrote:
> Hello:
>
>  Recently I found a strange question about Gromacs-4.6.2 on GPU workstaion.
> In my GTX690 machine, when I run md production I found that the ECC is on.
> However, in my another GTX590 machine, I found the ECC was off:
>
> 4 GPUs detected:
>   #0: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>   #1: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>   #2: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>   #3: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>
> moreover, there is only two GTX590 in the machine, I don't know why Gromacs
> claimed 4 GPU detected. However, in my another Linux machine which also have
> two GTX590, Gromacs-4.6.2 only find 2 GPU, and ECC is still off.
>
> I am just wondering:
>
> (1) why in GTX690 the ECC can be on while it is off in my GTX590? I compiled
> Gromacs with the same options and the same version of intel compiler

Unless your 690 is in fact a Tesla K10, it surely does not support ECC!
Note that ECC is not something I personally think you really need.

>
> (2) why in machines both of physically installed two GTX590 cards, one of
> them was detected with 4 GPU while the other was claimed contains two GPU?
>

Both the GTX 590 and 690 are dual-chip boards, which means two independent
processing units, each with its own memory, mounted on the same card and
connected by a PCI-E switch (NVIDIA NF200). Hence, the two GPUs on these
dual-chip boards will be enumerated as separate devices. You can
double-check this with nvidia-smi, which should list the same devices as
what mdrun reports. I suspect that the machine which shows only two GPUs
suffers from some hardware or software issue.
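A quick way to make that cross-check (the exact output format depends on your driver version):

```shell
# List the CUDA devices the driver sees; a dual-chip GTX 590/690 board
# should appear as two entries here, matching what mdrun reports.
nvidia-smi -L
```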


Regards,
Szilard

> thank you very much
>
> best
> Albert


Aw: Re: [gmx-users] GPU problem

2013-06-04 Thread lloyd riggs
 

Thanks, that's exactly what I was looking for.

 

Stephan


Gesendet: Dienstag, 04. Juni 2013 um 22:28 Uhr
Von: "Justin Lemkul" 
An: "Discussion list for GROMACS users" 
Betreff: Re: [gmx-users] GPU problem



On 6/4/13 3:52 PM, lloyd riggs wrote:
> Dear All or anyone,
> A stupid question. Is there a script anyone knows of to convert a 53a6ff from
> .top redirects to the gromacs/top directory to something like a ligand .itp?
> This is useful at the moment. Example:
> [bond]
> 6 7 2 gb_5
> to
> [bonds]
> ; ai aj fu c0, c1, ...
> 6 7 2 0.139 1080.0 0.139 1080.0 ; C CH
> for everything (a protein/DNA complex), inclusive of angles and dihedrals?
> I've been playing with some of the gromacs user-supplied files, but nothing yet.

Sounds like something grompp -pp should take care of.

-Justin

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin






Re: [gmx-users] GPU problem

2013-06-04 Thread Justin Lemkul



On 6/4/13 3:52 PM, lloyd riggs wrote:

Dear All or anyone,
A stupid question.  Is there a script anyone knows of to convert a 53a6ff from
.top redirects to the gromacs/top directory to something like a ligand .itp?
This is useful at the moment.  Example:
[bond]
 6 7 2gb_5
to
[bonds]
; ai  aj  fuc0, c1, ...
   6  7   20.139  1080.00.139  1080.0 ;   C  CH
for everything (a protein/DNA complex), inclusive of angles and dihedrals?
I've been playing with some of the gromacs user-supplied files, but nothing yet.


Sounds like something grompp -pp should take care of.

-Justin
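For reference, a minimal sketch of the suggested approach (file names are placeholders, not from the original thread):

```shell
# grompp -pp writes out a fully pre-processed topology in which all
# #included force-field parameters (e.g. the gb_5 bond-type macros) are
# expanded to their numeric values - essentially the .itp-style listing
# asked about.
grompp -f md.mdp -c conf.gro -p topol.top -pp processed.top -o md.tpr
```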

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




RE:[gmx-users] GPU problem

2013-06-04 Thread lloyd riggs
 

Dear All or anyone,

 

A stupid question.  Is there a script anyone knows of to convert a 53a6ff topology from .top redirects to the gromacs/top directory into something like a ligand .itp?  This is useful at the moment.  Example:

 

[bond]

    6 7 2    gb_5

 

to

 

[bonds]

; ai  aj  fu    c0, c1, ...

  6  7   2    0.139  1080.0    0.139  1080.0 ;   C  CH  

 

for everything (a protein/DNA complex), inclusive of angles and dihedrals?

 

I've been playing with some of the gromacs user-supplied files, but nothing yet.

 

Stephan Watkins

Re: [gmx-users] GPU problem

2013-06-04 Thread Szilárd Páll
"-nt" is mostly a backward-compatibility option and sets the total
number of threads (per rank). Instead, you should set both "-ntmpi"
(or -np with MPI) and "-ntomp". However, note that unless a single
mdrun uses *all* cores/hardware threads on a node, it won't pin its
threads to cores. Failing to pin threads can lead to considerable
performance degradation; I just tried it and, depending on how (un)lucky
the thread placement and migration is, I get 1.5-2x performance
degradation when running two mdruns on a single dual-socket node
without pinning threads.

My advice is (yet again) that you should check the
http://www.gromacs.org/Documentation/Acceleration_and_parallelization
wiki page, in particular the section on how to run simulations. If
things are not clear, please ask for clarification - input and
constructive criticism should help us improve the wiki.

We have been patiently pointing everyone to the wiki, so asking
without reading up first is neither productive nor really fair.

Cheers,
--
Szilárd
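As a concrete (hypothetical) sketch for the 32-core, four-GPU box discussed in this thread - flag names as in mdrun 4.6; the core split and file names are assumptions:

```shell
# Two concurrent mdrun jobs, each driving two GPUs with 16 of the 32 cores.
# -pin on forces explicit thread affinity; -pinoffset shifts the second job
# onto the second half of the cores so the two runs do not overlap.
mdrun -s md1.tpr -ntmpi 2 -ntomp 8 -gpu_id 01 -pin on -pinoffset 0 &
mdrun -s md2.tpr -ntmpi 2 -ntomp 8 -gpu_id 23 -pin on -pinoffset 16 &
wait
```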


On Tue, Jun 4, 2013 at 11:22 AM, Chandan Choudhury  wrote:
> Hi Albert,
>
> I think using -nt flag (-nt=16) with mdrun would solve your problem.
>
> Chandan
>
>
> --
> Chandan kumar Choudhury
> NCL, Pune
> INDIA
>
>
> On Tue, Jun 4, 2013 at 12:56 PM, Albert  wrote:
>
>> Dear:
>>
>>  I've got four GPU in one workstation. I am trying to run two GPU job with
>> command:
>>
>> mdrun -s md.tpr -gpu_id 01
>> mdrun -s md.tpr -gpu_id 23
>>
>> there are 32 CPU cores in this workstation. I found that each job tried to use
>> the whole CPU, and there were 64 sub-jobs when these two GPU mdruns were submitted.
>> Moreover, one of the jobs stopped after running for a short while, probably because
>> of the CPU issue.
>>
>> I am just wondering, how can we distribute CPU when we run two GPU job in
>> a single workstation?
>>
>> thank you very much
>>
>> best
>> Albert


Re: [gmx-users] GPU problem

2013-06-04 Thread Albert

On 06/04/2013 11:22 AM, Chandan Choudhury wrote:

Hi Albert,

I think using -nt flag (-nt=16) with mdrun would solve your problem.

Chandan



thank you so much.

it works well now.

ALBERT


Re: [gmx-users] GPU problem

2013-06-04 Thread Chandan Choudhury
Hi Albert,

I think using -nt flag (-nt=16) with mdrun would solve your problem.

Chandan


--
Chandan kumar Choudhury
NCL, Pune
INDIA


On Tue, Jun 4, 2013 at 12:56 PM, Albert  wrote:

> Dear:
>
>  I've got four GPU in one workstation. I am trying to run two GPU job with
> command:
>
> mdrun -s md.tpr -gpu_id 01
> mdrun -s md.tpr -gpu_id 23
>
> there are 32 CPU cores in this workstation. I found that each job tried to use
> the whole CPU, and there were 64 sub-jobs when these two GPU mdruns were submitted.
> Moreover, one of the jobs stopped after running for a short while, probably because
> of the CPU issue.
>
> I am just wondering, how can we distribute CPU when we run two GPU job in
> a single workstation?
>
> thank you very much
>
> best
> Albert


Re: Re: Re: [gmx-users] GPU-based workstation

2013-05-28 Thread James Starlight
Dear Dr. Pall!

Thank you for your suggestions!

Assuming that I have a budget of $5000, I want to build a GPU-based desktop
with this money.

Previously I've used a single 4-core i5 with a GTX 670 and obtained an average
of 10 ns/day for 70k-atom systems (1.0 nm cutoffs, no virtual sites,
sd integrator).

Now I'd like to build a system based on 2 high-end GeForces (e.g. the TITAN).
Should that system include 2 CPUs for good balancing? (E.g., could two 6-core
Xeons with faster clocks be better for simulations than an i7?)

What additional properties of the motherboard should I consider for such a system?

James

2013/5/28 lloyd riggs 

> Dear Dr. Pali,
>
> Thank you,
>
> Stephan Watkins
>
> *Gesendet:* Dienstag, 28. Mai 2013 um 19:50 Uhr
> *Von:* "Szilárd Páll" 
>
> *An:* "Discussion list for GROMACS users" 
> *Betreff:* Re: Re: [gmx-users] GPU-based workstation
> Dear all,
>
> As far as I understand, the OP is interested in hardware for *running*
> GROMACS 4.6 rather than developing code or running LINPACK.
>
>
> To get best performance it is important to use a machine with hardware
> balanced for GROMACS' workloads. Too little GPU resources will result
> in CPU idling; too much GPU resources will lead to the runs being CPU
> or multi-GPU scaling bound and above a certain level GROMACS won't be
> able to make use of additional GPUs.
>
> Of course, the balance will depend both on hardware and simulation
> settings (mostly the LJ cut-off used).
>
> An additional factor to consider is typical system size. To reach near
> peak pair-force throughput on GPUs you typically need >20k-40k
> particles/GPU (depends on the architecture) and throughput drops below
> these values. Hence, in most cases it is preferred to use fewer and
> faster GPUs rather than more.
>
> Without knowing the budget and intended use of the machine it is hard
> to make suggestions, but I would say for a budget desktop box a
> quad-core Intel Ivy Bridge or the top-end AMD Piledriver CPU with a
> fast Kepler GTX card (e.g. GTX 680 or GTX 770/780) should work well.
> If you're considering dual-socket workstations, I suggest you go with
> the higher core-count and higher frequency Intel CPUs (6+ cores >2.2
> GHz), otherwise you may not see as much benefit as you would expect
> based on the insane price tag (especially if you compare to an i7
> 3939K or its IVB successor).
>
> Cheers,
> --
> Szilárd
>
>
> On Sat, May 25, 2013 at 1:02 PM, lloyd riggs  wrote:
> > More RAM the better, and the best I have seen is 4 GPU work station. I
> can
> > use/have used 4. The GPU takes 2 slots though, so a 7-8 PCIe board is
> > really 3-4 GPU, except the tyan mentioned (there designed as blades so
> an 8
> > or 10 slot board really holds 8 or 10 GPU's). There's cooling problems
> > though with GPU's, as on a board there packed, so extra cooling things
> may
> > help not blow a GPU, but I would look for good ones (ask around), as its
> a
> > video game market and they go for looks even though its in casing? The
> > external RAM (not onboard GPU RAM) helps if you do a larger sim, but I
> dont
> > know performance wise, the onboard GPU, the more RAM the marrier...so
> yes,
> > normal work stations you can get 4 GPU's for a 300 US$ board, but then
> the
> > price goes way up (3-4000 US$ for an 8-10 gpu board). RAM ordered abroad
> is
> > also cheep, 8 or 16 MB Vs. Shop...I have used 4 GPU's but only on tests
> > software, not Gromacs, so would be nice to see performance...for a small
> 100
> > atom molecule and 500 solvent, using just the CPU I get it to run 5-10
> > minutes real for 1 ns sim, but tried simple large 800 amino, 25,000
> solvent
> > eq (NVT or NPT) runs and they clock at around 1 hour real for say 50 ps
> > eq's
> >
> > Stephan
> >
> > Gesendet: Samstag, 25. Mai 2013 um 07:54 Uhr
> > Von: "James Starlight" 
> > An: "Discussion list for GROMACS users" 
> > Betreff: Re: [gmx-users] GPU-based workstation
> > Dear Dr. Watkins!
> >
> > Thank you for the suggestions!
> >
> > In the local shops I've found only Core i7 with 6 cores (like Core
> > i7-39xx) and 4 cores. Should I obtain much better performance with 6
> cores
> > than with 4 cores in case of i7 cpu (assuming that I run simulation in
> > cpu+gpu mode )?
> >
> > Also you've mentioned about 4 PCeI MD. Does it means that modern
> > work-station could have 4 GPU's in one home-like desktop ? According to
> my
> > current task I suppose that 2 GPU's would be suitable 

Aw: Re: Re: [gmx-users] GPU-based workstation

2013-05-28 Thread lloyd riggs

Dear Dr. Pali,

 

Thank you,

 

Stephan Watkins

 

Gesendet: Dienstag, 28. Mai 2013 um 19:50 Uhr
Von: "Szilárd Páll" 
An: "Discussion list for GROMACS users" 
Betreff: Re: Re: [gmx-users] GPU-based workstation

Dear all,

As far as I understand, the OP is interested in hardware for *running*
GROMACS 4.6 rather than developing code or running LINPACK.


To get best performance it is important to use a machine with hardware
balanced for GROMACS' workloads. Too little GPU resources will result
in CPU idling; too much GPU resources will lead to the runs being CPU
or multi-GPU scaling bound and above a certain level GROMACS won't be
able to make use of additional GPUs.

Of course, the balance will depend both on hardware and simulation
settings (mostly the LJ cut-off used).

An additional factor to consider is typical system size. To reach near
peak pair-force throughput on GPUs you typically need >20k-40k
particles/GPU (depends on the architecture) and throughput drops below
these values. Hence, in most cases it is preferred to use fewer and
faster GPUs rather than more.

Without knowing the budget and intended use of the machine it is hard
to make suggestions, but I would say for a budget desktop box a
quad-core Intel Ivy Bridge or the top-end AMD Piledriver CPU with a
fast Kepler GTX card (e.g. GTX 680 or GTX 770/780) should work well.
If you're considering dual-socket workstations, I suggest you go with
the higher core-count and higher frequency Intel CPUs (6+ cores >2.2
GHz), otherwise you may not see as much benefit as you would expect
based on the insane price tag (especially if you compare to an i7
3939K or its IVB successor).

Cheers,
--
Szilárd


On Sat, May 25, 2013 at 1:02 PM, lloyd riggs  wrote:
> More RAM the better, and the best I have seen is 4 GPU work station. I can
> use/have used 4. The GPU takes 2 slots though, so a 7-8 PCIe board is
> really 3-4 GPU, except the tyan mentioned (there designed as blades so an 8
> or 10 slot board really holds 8 or 10 GPU's). There's cooling problems
> though with GPU's, as on a board there packed, so extra cooling things may
> help not blow a GPU, but I would look for good ones (ask around), as its a
> video game market and they go for looks even though its in casing? The
> external RAM (not onboard GPU RAM) helps if you do a larger sim, but I dont
> know performance wise, the onboard GPU, the more RAM the marrier...so yes,
> normal work stations you can get 4 GPU's for a 300 US$ board, but then the
> price goes way up (3-4000 US$ for an 8-10 gpu board). RAM ordered abroad is
> also cheep, 8 or 16 MB Vs. Shop...I have used 4 GPU's but only on tests
> software, not Gromacs, so would be nice to see performance...for a small 100
> atom molecule and 500 solvent, using just the CPU I get it to run 5-10
> minutes real for 1 ns sim, but tried simple large 800 amino, 25,000 solvent
> eq (NVT or NPT) runs and they clock at around 1 hour real for say 50 ps
> eq's
>
> Stephan
>
> Gesendet: Samstag, 25. Mai 2013 um 07:54 Uhr
> Von: "James Starlight" 
> An: "Discussion list for GROMACS users" 
> Betreff: Re: [gmx-users] GPU-based workstation
> Dear Dr. Watkins!
>
> Thank you for the suggestions!
>
> In the local shops I've found only Core i7 with 6 cores (like Core
> i7-39xx) and 4 cores. Should I obtain much better performance with 6 cores
> than with 4 cores in case of i7 cpu (assuming that I run simulation in
> cpu+gpu mode )?
>
> Also you've mentioned about 4 PCeI MD. Does it means that modern
> work-station could have 4 GPU's in one home-like desktop ? According to my
> current task I suppose that 2 GPU's would be suitable for my simulations
> (assuming that I use typical ASUS MB and 650 Watt power unit). Have
> someone tried to use several GPU's on one workstation ? What attributes of
> MB should be taken into account for best performance on such multi-gpu
> station ?
>
> James
>
> 2013/5/25 lloyd riggs 
>
>> There's also these, but 1 chip runs 6K US, they can get performance up to
>> 2.3 teraflops per chip though double percission...but have no clue about
>> integration with GPU's...Intell also sells their chips on PCIe cards...but
>> get only about 350 Gflops, and run 1K US$.
>>
>> http://en.wikipedia.org/wiki/Field-programmable_gate_array and vendor
>> http://www.xilinx.com/
>>
>> They can design them though to fit a PCIe slot and run about the same, but
>> still need the board, ram etc...
>>
>> Mostly just to dream about, they say you can order them with radiation
>> shielding as well...so...
>>
>> Stephan Watkins
>>
>> *Gesendet:* Freitag, 24. Mai 2013 um 13:17 Uhr
>> *Von:* "J

Re: Aw: Re: [gmx-users] GPU-based workstation

2013-05-28 Thread Szilárd Páll
On Sat, May 25, 2013 at 2:16 PM, Broadbent, Richard
 wrote:
> I've been running on my University's GPU nodes; these each have one Xeon E5
> (6 cores, 12 threads) and four Nvidia GTX 690s. My system is 93,000 atoms of
> DMF under NVE. The performance has been a little disappointing:

That sounds like a very imbalanced system for GROMACS: you have
essentially 8 GPUs with rather poor PCI-E performance (the two GPUs on a
board share a single PCI-E link) and only 12 CPU cores to "drive" the simulation.

> ~10 ns/day. On my home system using a Core i5-2500 and an Nvidia 560 Ti I
> get 5.4 ns/day for the same system. On our HPC system using 32 nodes,
> each with 2 quad-core Xeon processors, I get 30-40 ns/day.

That sounds somewhat low if these are all moderately fast CPUs and GPUs.

> I think that to achieve reasonable performance the system has to be balanced
> between CPUs and GPUs; probably getting 2 high-end GPUs and a top-end Xeon
> E5 or Core i7 would be a good choice.

Indeed. Even two GPUs may be too much - unless the CPU in question is
a very high end i7 or E5.

Cheers,
--
Szilárd

>
>
> Richard
>
> From: lloyd riggs 
> Reply-To: Discussion users 
> Date: Saturday, 25 May 2013 12:02
> To: Discussion users 
> Subject: Aw: Re: [gmx-users] GPU-based workstation
>
> More RAM the better, and the best I have seen is 4 GPU work station.  I can 
> use/have used 4.  The GPU takes 2 slots though, so a 7-8 PCIe board is really 
> 3-4 GPU, except the tyan mentioned (there designed as blades so an 8 or 10 
> slot board really holds 8 or 10 GPU's).  There's cooling problems though with 
> GPU's, as on a board there packed, so extra cooling things may help not blow 
> a GPU, but I would look for good ones (ask around), as its a video game 
> market and they go for looks even though its in casing?  The external RAM 
> (not onboard GPU RAM) helps if you do a larger sim, but I dont know 
> performance wise, the onboard GPU, the more RAM the marrier...so yes, normal 
> work stations you can get 4 GPU's for a 300 US$ board, but then the price 
> goes way up (3-4000 US$ for an 8-10 gpu board).  RAM ordered abroad is also 
> cheep, 8 or 16 MB Vs. Shop...I have used 4 GPU's but only on tests software, 
> not Gromacs, so would be nice to see performance...for a small 100 atom 
> molecule and 500 solvent, using just the CPU I get it to run 5-10 minutes 
> real  for 1 ns sim, but tried simple large 800 amino, 25,000 solvent eq (NVT 
> or NPT) runs and they clock at around 1 hour real for say 50 ps eq's
>
> Stephan
>
> Gesendet: Samstag, 25. Mai 2013 um 07:54 Uhr
> Von: "James Starlight" 
> An: "Discussion list for GROMACS users" 
> Betreff: Re: [gmx-users] GPU-based workstation
> Dear Dr. Watkins!
>
> Thank you for the suggestions!
>
> In the local shops I've found only Core i7 with 6 cores (like Core
> i7-39xx) and 4 cores. Should I obtain much better performance with 6 cores
> than with 4 cores in case of i7 cpu (assuming that I run simulation in
> cpu+gpu mode )?
>
> Also you've mentioned about 4 PCeI MD. Does it means that modern
> work-station could have 4 GPU's in one home-like desktop ? According to my
> current task I suppose that 2 GPU's would be suitable for my simulations
> (assuming that I use typical ASUS MB and 650 Watt power unit). Have
> someone tried to use several GPU's on one workstation ? What attributes of
> MB should be taken into account for best performance on such multi-gpu
> station ?
>
> James
>
> 2013/5/25 lloyd riggs 
>
>> There's also these, but 1 chip runs 6K US, they can get performance up to
>> 2.3 teraflops per chip though double percission...but have no clue about
>> integration with GPU's...Intell also sells their chips on PCIe cards...but
>> get only about 350 Gflops, and run 1K US$.
>>
>> http://en.wikipedia.org/wiki/Field-programmable_gate_array and vendor
>> http://www.xilinx.com/
>>
>> They can design them though to fit a PCIe slot and run about the same, but
>> still need the board, ram etc...
>>
>> Mostly just to dream about, they say you can order them with radiation
>> shielding as well...so...
>>
>> Stephan Watkins
>>
>> *Gesendet:* Freitag, 24. Mai 2013 um 13:17 Uhr
>> *Von:* "James Starlight" 
>> *An:* "Discussion list for GROMACS users" 
>> *Betreff:* [gmx

Re: Re: [gmx-users] GPU-based workstation

2013-05-28 Thread Szilárd Páll
Dear all,

As far as I understand, the OP is interested in hardware for *running*
GROMACS 4.6 rather than developing code or running LINPACK.


To get best performance it is important to use a machine with hardware
balanced for GROMACS' workloads. Too little GPU resources will result
in CPU idling; too much GPU resources will lead to the runs being CPU
or multi-GPU scaling bound and above a certain level GROMACS won't be
able to make use of additional GPUs.

Of course, the balance will depend both on hardware and simulation
settings (mostly the LJ cut-off used).

An additional factor to consider is typical system size. To reach near
peak pair-force throughput on GPUs you typically need >20k-40k
particles/GPU (depends on the architecture) and throughput drops below
these values. Hence, in most cases it is preferred to use fewer and
faster GPUs rather than more.
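As a rough illustration of that sizing rule (the 70k-atom figure is taken from earlier in this thread; 20k is the lower end of the range quoted above):

```python
# Particles per GPU for a 70,000-atom system split across 1, 2 or 4 GPUs.
atoms = 70_000
for n_gpus in (1, 2, 4):
    per_gpu = atoms // n_gpus
    near_peak = per_gpu >= 20_000   # below ~20k-40k particles/GPU, throughput drops
    print(f"{n_gpus} GPU(s): {per_gpu} particles/GPU, near peak throughput: {near_peak}")
```

On these numbers, two fast GPUs sit at 35k particles each (inside the sweet spot), while four would fall below it.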

Without knowing the budget and intended use of the machine it is hard
to make suggestions, but I would say for a budget desktop box a
quad-core Intel Ivy Bridge or the top-end AMD Piledriver CPU with a
fast Kepler GTX card (e.g. GTX 680 or GTX 770/780) should work well.
If you're considering dual-socket workstations, I suggest you go with
the higher core-count and higher frequency Intel CPUs (6+ cores >2.2
GHz), otherwise you may not see as much benefit as you would expect
based on the insane price tag (especially if you compare to an i7
3939K or its IVB successor).

Cheers,
--
Szilárd



Re: Re: Re: [gmx-users] GPU-based workstation

2013-05-27 Thread James Starlight
In Nvidia's benchmarks I've found suggestions to use two 6-core CPUs
for systems with 2 GPUs.

Assuming that I'll be using two GTX 680 cards with a 256-bit bus and 4 GB of
RAM (not professional Nvidia cards like the Tesla),
which CPUs would give me the best performance: one i7 with 8 cores,
or two Xeon E5s with 6 cores each? Does it make sense to use 2 separate CPUs,
with several cores each, for the 2 GPUs?

James
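For reference, GROMACS 4.6 can drive multiple GPUs from a single mdrun
process via thread-MPI, mapping one rank per GPU with -ntmpi and -gpu_id.
A minimal sketch of a two-GPU launch on a 12-core box (the file name
"topol" and the core/GPU counts are hypothetical; adjust to your hardware):

```shell
# Sketch: launching GROMACS 4.6 mdrun on a 2-GPU, 12-core workstation.
# One thread-MPI rank per GPU, 6 OpenMP threads per rank; -gpu_id takes one
# digit per rank, so "01" maps rank 0 -> GPU 0 and rank 1 -> GPU 1.
NRANKS=2
NTHREADS_PER_RANK=6
GPU_IDS="01"
CMD="mdrun -ntmpi ${NRANKS} -ntomp ${NTHREADS_PER_RANK} -gpu_id ${GPU_IDS} -deffnm topol"
# Echo instead of executing, since mdrun may not be on PATH here.
echo "${CMD}"
```

With two physical CPUs, one rank per socket (and per GPU) usually keeps the
domain decomposition and PCIe traffic local to each socket.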


Aw: Re: Re: [gmx-users] GPU-based workstation

2013-05-25 Thread lloyd riggs
 

You can also look at profiling on various web sites; the high-end Nvidias run only slightly better than the 2-year-old ones, so from an individual's point of view they're not worth the money yet, but if you have the money... as I've been browsing.

Also, the sim I did on the cluster was 180,000-190,000 atoms, so the exact same performance the other person had.

 

Stephan



Aw: Re: Re: [gmx-users] GPU-based workstation

2013-05-25 Thread lloyd riggs
 

I'd go for the 6-core i7.

To the other message: funny. I bought ATIs as they clock faster and cost 1/3 the price of Nvidias, but then the software all went to Nvidia. The new ATIs with twice the shaders run at the same speed (around 1-1.3 teraflops) due to the same IO problems the Nvidias ran into (or maybe onboard RAM does solve the problem if they went up to 16 or 32 MB). Gromacs, etc. doesn't run on ATIs, and I've been hoping they, AMD, catch up, but all I ever see is the constant "in 6 months", then nothing.

I ran around forty 4 ns simulations on university blades with 8 AMD quad-cores; using 3 blades I was only able to get 1 ns/day, but I never pressed the question of why it was so slow, as I needed to finish. With an Nvidia at even 5 ns/day, I (or a lot of people) could do some really nice work as far as publishing goes, with raw data in 2 weeks' time, so now I feel a bit saddened...

I also just found an OpenCL profiling tool with CUDA 5 that will take any C or C++ software and mark all sections you need to convert to OpenCL, but the trial software is 30 days, then 250 US$...

 

Stephan



Re: Aw: Re: [gmx-users] GPU-based workstation

2013-05-25 Thread James Starlight
Richard,

thanks for the suggestion!

Assuming that I'm using 2 high-end GeForces, which would perform better:

1) one i7 (4 or 6 cores)?

2) an 8-core Xeon, like the Intel Xeon E5-2650 2.0 GHz / 8 cores?

What properties of the motherboard should I take into account primarily for
such a Xeon-based system? Do such motherboards support multi-GPU (I noticed
that many of them lack enough PCIe slots)?

James


Re: Aw: Re: [gmx-users] GPU-based workstation

2013-05-25 Thread Broadbent, Richard
I've been running on my university's GPU nodes; these have one E5 Xeon
(6 cores, 12 threads) and 4 Nvidia GTX 690s. My system is 93,000 atoms of DMF
under NVE. The performance has been a little disappointing, ~10 ns/day. On my
home system, using a Core i5-2500 and an Nvidia 560 Ti, I get 5.4 ns/day for
the same system. On our HPC system, using 32 nodes each with 2 quad-core Xeon
processors, I get 30-40 ns/day.

I think that to achieve reasonable performance the system has to be balanced
between CPUs and GPUs; probably getting 2 high-end GPUs and a top-end Xeon E5
or Core i7 would be a good choice.


Richard


Aw: Re: [gmx-users] GPU-based workstation

2013-05-25 Thread lloyd riggs

More RAM the better, and the best I have seen is a 4-GPU workstation. I can
use/have used 4. Each GPU takes 2 slots though, so a 7-8-slot PCIe board really
holds 3-4 GPUs, except the Tyan mentioned (they're designed as blades, so an
8- or 10-slot board really holds 8 or 10 GPUs). There are cooling problems with
GPUs, though, as on a board they're packed, so extra cooling may help you not
blow a GPU; but I would look for good ones (ask around), as it's a video-game
market and they go for looks even though it's all in a casing. The external RAM
(not onboard GPU RAM) helps if you do a larger sim, but I don't know
performance-wise; for the onboard GPU RAM, the more the merrier... So yes, on
normal workstations you can get 4 GPUs for a 300 US$ board, but then the price
goes way up (3,000-4,000 US$ for an 8-10 GPU board). RAM ordered abroad is also
cheap, 8 or 16 MB vs. the shop... I have used 4 GPUs, but only on test
software, not Gromacs, so it would be nice to see performance numbers... For a
small 100-atom molecule and 500 solvent, using just the CPU I get it to run in
5-10 minutes real time for a 1 ns sim, but simple large 800-amino-acid,
25,000-solvent eq (NVT or NPT) runs clock in at around 1 hour real time for,
say, 50 ps eq's.

 

Stephan

 




Re: [gmx-users] GPU-based workstation

2013-05-25 Thread James Starlight
Dear Dr. Watkins!

Thank you for the suggestions!

In the local shops I've found only Core i7s with 6 cores (like the Core
i7-39xx) and with 4 cores. Should I expect much better performance with 6
cores than with 4 cores for an i7 CPU (assuming that I run the simulation in
CPU+GPU mode)?

Also, you mentioned 4-PCIe motherboards. Does that mean a modern
workstation could have 4 GPUs in one home-like desktop? For my current task
I suppose that 2 GPUs would be suitable for my simulations (assuming that I
use a typical ASUS motherboard and a 650 W power unit). Has anyone tried to
use several GPUs on one workstation? What attributes of the motherboard
should be taken into account for best performance on such a multi-GPU
station?

James

2013/5/25 lloyd riggs 

> There's also these, but 1 chip runs 6K US$; they can get performance up to
> 2.3 teraflops per chip, though in double precision... but I have no clue about
> integration with GPUs... Intel also sells their chips on PCIe cards... but
> those get only about 350 Gflops, and run 1K US$.
>
> http://en.wikipedia.org/wiki/Field-programmable_gate_array and vendor
> http://www.xilinx.com/
>
> They can design them to fit a PCIe slot, though, and run about the same, but
> still need the board, RAM, etc...
>
> Mostly just to dream about, they say you can order them with radiation
> shielding as well...so...
>
> Stephan Watkins
>
> *Sent:* Friday, 24 May 2013 at 13:17
> *From:* "James Starlight" 
> *To:* "Discussion list for GROMACS users" 
> *Subject:* [gmx-users] GPU-based workstation
> Dear Gromacs Users!
>
>
> I'd like to build a new workstation for performing simulations on GPU with
> Gromacs 4.6's native CUDA support.
> Recently I've used such a setup with a Core i5 CPU and an Nvidia GTX 670
> video card and obtained good performance (~20 ns/day for a typical
> 60,000-atom system with the SD integrator).
>
>
> Now I'd like to build a multi-GPU workstation.
>
> My question: how many GPUs would give me the best performance on a typical
> home-like workstation? What mode of Nvidia GPU integration should I
> use (e.g. SLI, etc.)?
>
>
> Thanks for help,
>
>
> James


Re: [gmx-users] GPU job often stopped

2013-05-02 Thread Albert

the problem is still there...

:-(



On 04/29/2013 06:06 PM, Szilárd Páll wrote:

On Mon, Apr 29, 2013 at 3:51 PM, Albert  wrote:

>On 04/29/2013 03:47 PM, Szilárd Páll wrote:

>>
>>In that case, while it isn't very likely, the issue could be caused by
>>some implementation detail which aims to avoid performance loss caused
>>by an issue in the NVIDIA drivers.
>>
>>Try running with the GMX_CUDA_STREAMSYNC environment variable set.
>>
>>Btw, were there any other processes using the GPU while mdrun was running?
>>
>>Cheers,
>>--
>>Szilárd

>
>
>thanks for the kind reply.
>There is no other process running when I am running Gromacs.
>
>do you mean I should set GMX_CUDA_STREAMSYNC in the job script like:
>
>export GMX_CUDA_STREAMSYNC=/opt/cuda-5.0

Sort of, but the value does not matter. So if your shell is bash, the
above as well as simply "export GMX_CUDA_STREAMSYNC=" will work fine.

Let us know if this avoided the crash - when you have simulated long
enough to be able to judge.

Cheers,
--
Szilárd
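Szilárd's tip above - that only the presence of GMX_CUDA_STREAMSYNC matters, not its value - can be sketched in shell; the mdrun invocation at the end is illustrative only:

```shell
# GMX_CUDA_STREAMSYNC is a flag-style variable: mdrun only checks that it
# exists in the environment, so an empty value is enough.
export GMX_CUDA_STREAMSYNC=
# "${VAR+x}" expands to "x" if VAR is set (even to an empty string)
if [ -n "${GMX_CUDA_STREAMSYNC+x}" ]; then
    echo "GMX_CUDA_STREAMSYNC is set"
fi
# then launch as usual, e.g.:  mdrun -s md.tpr
```

Note that assigning a path (as in the question) also works, since the value is ignored.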





Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
On Mon, Apr 29, 2013 at 3:51 PM, Albert  wrote:
> On 04/29/2013 03:47 PM, Szilárd Páll wrote:
>>
>> In that case, while it isn't very likely, the issue could be caused by
>> some implementation detail which aims to avoid performance loss caused
>> by an issue in the NVIDIA drivers.
>>
>> Try running with the GMX_CUDA_STREAMSYNC environment variable set.
>>
>> Btw, were there any other processes using the GPU while mdrun was running?
>>
>> Cheers,
>> --
>> Szilárd
>
>
> thanks for the kind reply.
> There is no other process running when I am running Gromacs.
>
> do you mean I should set GMX_CUDA_STREAMSYNC in the job script like:
>
> export GMX_CUDA_STREAMSYNC=/opt/cuda-5.0

Sort of, but the value does not matter. So if your shell is bash, the
above as well as simply "export GMX_CUDA_STREAMSYNC=" will work fine.

Let us know if this avoided the crash - when you have simulated long
enough to be able to judge.

Cheers,
--
Szilárd

>
> ?
>
> THX
> Albert
>
>
>
>


Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Albert

On 04/29/2013 03:47 PM, Szilárd Páll wrote:

In that case, while it isn't very likely, the issue could be caused by
some implementation detail which aims to avoid performance loss caused
by an issue in the NVIDIA drivers.

Try running with the GMX_CUDA_STREAMSYNC environment variable set.

Btw, were there any other processes using the GPU while mdrun was running?

Cheers,
--
Szilárd


thanks for the kind reply.
There is no other process running when I am running Gromacs.

do you mean I should set GMX_CUDA_STREAMSYNC in the job script like:

export GMX_CUDA_STREAMSYNC=/opt/cuda-5.0

?

THX
Albert





Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
In that case, while it isn't very likely, the issue could be caused by
some implementation detail which aims to avoid performance loss caused
by an issue in the NVIDIA drivers.

Try running with the GMX_CUDA_STREAMSYNC environment variable set.

Btw, were there any other processes using the GPU while mdrun was running?

Cheers,
--
Szilárd


On Mon, Apr 29, 2013 at 3:32 PM, Albert  wrote:
> On 04/29/2013 03:31 PM, Szilárd Páll wrote:
>>
>> The segv indicates that mdrun crashed and not that the machine was
>> restarted. The GPU detection output (both on stderr and log) should
>> show whether ECC is "on" (and so does the nvidia-smi tool).
>>
>> Cheers,
>> --
>> Szilárd
>
>
> yes it was on:
>
>
> Reading file heavy.tpr, VERSION 4.6.1 (single precision)
> Using 4 MPI threads
> Using 8 OpenMP threads per tMPI thread
>
> 5 GPUs detected:
>   #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>   #1: NVIDIA GeForce GTX 650, compute cap.: 3.0, ECC:  no, stat: compatible
>   #2: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>   #3: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>   #4: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
>
> 4 GPUs user-selected for this run: #0, #2, #3, #4
>
>


Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Albert

On 04/29/2013 03:31 PM, Szilárd Páll wrote:

The segv indicates that mdrun crashed and not that the machine was
restarted. The GPU detection output (both on stderr and log) should
show whether ECC is "on" (and so does the nvidia-smi tool).

Cheers,
--
Szilárd


yes it was on:


Reading file heavy.tpr, VERSION 4.6.1 (single precision)
Using 4 MPI threads
Using 8 OpenMP threads per tMPI thread

5 GPUs detected:
  #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #1: NVIDIA GeForce GTX 650, compute cap.: 3.0, ECC:  no, stat: compatible
  #2: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #3: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #4: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible

4 GPUs user-selected for this run: #0, #2, #3, #4
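The ECC column in the detection output above can also be checked mechanically; a small sketch that counts ECC-enabled boards from a saved log fragment (the file name `gpu_detect.log` is hypothetical):

```shell
# Save the detection lines (as printed by mdrun) and count ECC-enabled GPUs
cat > gpu_detect.log <<'EOF'
  #0: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #1: NVIDIA GeForce GTX 650, compute cap.: 3.0, ECC:  no, stat: compatible
  #2: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #3: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
  #4: NVIDIA Tesla K20m, compute cap.: 3.5, ECC: yes, stat: compatible
EOF
ECC_ON=$(grep -c 'ECC: yes' gpu_detect.log)
echo "ECC-enabled GPUs: $ECC_ON"
```

The same status can of course be read directly with the nvidia-smi tool, as Szilárd notes.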



Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
On Mon, Apr 29, 2013 at 2:41 PM, Albert  wrote:
> On 04/28/2013 05:45 PM, Justin Lemkul wrote:
>>
>>
>> Frequent failures suggest instability in the simulated system. Check your
>> .log file or stderr for informative Gromacs diagnostic information.
>>
>> -Justin
>
>
>
> my log file didn't have any errors; the end of the stopped log file looks
> something like:
>
> DD  step 2259  vol min/aver 0.967  load imb.: force  0.8%
>
>Step   Time Lambda
>226045200.00.0
>
>Energies (kJ/mol)
>   AngleU-BProper Dih.  Improper Dih.  LJ-14
> 9.86437e+034.02406e+043.52809e+046.13542e+02 8.61815e+03
>  Coulomb-14LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
> 1.25055e+043.05477e+04   -9.05956e+03   -6.02400e+05 1.58357e+03
>  Position Rest.  PotentialKinetic En.   Total Energy Temperature
> 1.39149e+02   -4.72066e+051.37165e+05   -3.34901e+05 3.11958e+02
>  Pres. DC (bar) Pressure (bar)   Constr. rmsd
>-2.94092e+02   -7.91535e+011.79812e-05
>
>
> also in the information file I only obtained information:
>
>
> step 13300, will finish Tue Apr 30 14:41
> NOTE: Turning on dynamic load balancing
>
>
> Probably the machine was restarted from time to time?

The segv indicates that mdrun crashed and not that the machine was
restarted. The GPU detection output (both on stderr and log) should
show whether ECC is "on" (and so does the nvidia-smi tool).

Cheers,
--
Szilárd


>
> best
> Albert
>
>
>


Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Albert

On 04/28/2013 05:45 PM, Justin Lemkul wrote:


Frequent failures suggest instability in the simulated system. Check 
your .log file or stderr for informative Gromacs diagnostic information.


-Justin 



my log file didn't have any errors; the end of the stopped log file looks 
something like:


DD  step 2259  vol min/aver 0.967  load imb.: force  0.8%

   Step   Time Lambda
   226045200.00.0

   Energies (kJ/mol)
  AngleU-BProper Dih.  Improper Dih.  LJ-14
9.86437e+034.02406e+043.52809e+046.13542e+02 8.61815e+03
 Coulomb-14LJ (SR)  Disper. corr.   Coulomb (SR)   Coul. recip.
1.25055e+043.05477e+04   -9.05956e+03   -6.02400e+05 1.58357e+03
 Position Rest.  PotentialKinetic En.   Total Energy Temperature
1.39149e+02   -4.72066e+051.37165e+05   -3.34901e+05 3.11958e+02
 Pres. DC (bar) Pressure (bar)   Constr. rmsd
   -2.94092e+02   -7.91535e+011.79812e-05


also in the information file I only obtained information:


step 13300, will finish Tue Apr 30 14:41
NOTE: Turning on dynamic load balancing


Probably the machine was restarted from time to time?

best
Albert




Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Albert

Hello:

 yes, I tried the CPU-only version; it runs well and doesn't stop. I am 
not sure whether I have ECC on or not. There are 4 Tesla K20s and one 
GTX 650 in the workstation; after compilation, I simply submit the jobs 
with the command:



mdrun -s md.tpr -gpu_id 0234

I submitted the same system on another GTX 690 machine, and it also runs 
well. I compiled Gromacs with the same options on that machine.


thank you very much
best
Albert



On 04/29/2013 01:19 PM, Szilárd Páll wrote:

Have you tried running on CPUs only just to see if the issue persists?
Unless the issue does not occur with the same binary on the same
hardware running on CPUs only, I doubt it's a problem in the code.

Do you have ECC on?
--
Szilárd
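Albert's `-gpu_id 0234` above is a digit string with one digit per PP rank, so the number of ranks launched has to match its length. A hedged sketch of deriving the rank count from the string (the mdrun line is only echoed, not run):

```shell
GPU_IDS=0234                 # use GPUs #0,#2,#3,#4; skip the GTX 650 at #1
NRANKS=${#GPU_IDS}           # one PP rank per listed GPU id
echo "mdrun -ntmpi $NRANKS -s md.tpr -gpu_id $GPU_IDS"
```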




Re: [gmx-users] GPU job often stopped

2013-04-29 Thread Szilárd Páll
Have you tried running on CPUs only just to see if the issue persists?
Unless the issue does not occur with the same binary on the same
hardware running on CPUs only, I doubt it's a problem in the code.

Do you have ECC on?
--
Szilárd


On Sun, Apr 28, 2013 at 5:27 PM, Albert  wrote:
> Dear:
>
>   I am running MD jobs on a workstation with 4 K20 GPUs and I found that the
> job always fails with the following messages from time to time:
>
>
> [tesla:03432] *** Process received signal ***
> [tesla:03432] Signal: Segmentation fault (11)
> [tesla:03432] Signal code: Address not mapped (1)
> [tesla:03432] Failing at address: 0xfffe02de67e0
> [tesla:03432] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)
> [0x7f4666da1cb0]
> [tesla:03432] [ 1] mdrun_mpi() [0x47dd61]
> [tesla:03432] [ 2] mdrun_mpi() [0x47d8ae]
> [tesla:03432] [ 3]
> /opt/intel/lib/intel64/libiomp5.so(__kmp_invoke_microtask+0x93)
> [0x7f46667904f3]
> [tesla:03432] *** End of error message ***
> --
> mpirun noticed that process rank 0 with PID 3432 on node tesla exited on
> signal 11 (Segmentation fault).
> --
>
>
> I can continue the jobs with the mdrun options "-append -cpi", but they still
> stop from time to time. I am just wondering what's the problem?
>
> thank you very much
> Albert


Re: [gmx-users] GPU job often stopped

2013-04-28 Thread Justin Lemkul



On 4/28/13 11:27 AM, Albert wrote:

Dear:

   I am running MD jobs on a workstation with 4 K20 GPUs and I found that the job
always fails with the following messages from time to time:


[tesla:03432] *** Process received signal ***
[tesla:03432] Signal: Segmentation fault (11)
[tesla:03432] Signal code: Address not mapped (1)
[tesla:03432] Failing at address: 0xfffe02de67e0
[tesla:03432] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0) 
[0x7f4666da1cb0]
[tesla:03432] [ 1] mdrun_mpi() [0x47dd61]
[tesla:03432] [ 2] mdrun_mpi() [0x47d8ae]
[tesla:03432] [ 3]
/opt/intel/lib/intel64/libiomp5.so(__kmp_invoke_microtask+0x93) [0x7f46667904f3]
[tesla:03432] *** End of error message ***
--
mpirun noticed that process rank 0 with PID 3432 on node tesla exited on signal
11 (Segmentation fault).
--


I can continue the jobs with the mdrun options "-append -cpi", but they still stop
from time to time. I am just wondering what's the problem?



Frequent failures suggest instability in the simulated system.  Check your .log 
file or stderr for informative Gromacs diagnostic information.


-Justin

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




Re: [gmx-users] GPU efficiency question

2013-04-27 Thread Mark Abraham
Probably the part of the calculation done on the GPU is not rate limiting.
There's no point having four chefs to make one dish...

Look at the beginning and end of your .log files for diagnostic
information. If this is a single node, you should be using threadMPI, not
real MPI. Generally four CPU cores vs four GPU cores will require an
extremely large PP load for the GPUs to all be effective.

Mark
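Mark's point about thread-MPI on a single node can be sketched as follows: the built-in thread-MPI mdrun takes `-ntmpi` directly, with no mpirun launcher (the tpr file name is a placeholder; the command is only echoed here):

```shell
NGPUS=4                                  # the two GTX 690s expose 4 devices
CMD="mdrun -ntmpi $NGPUS -gpu_id 0123 -s md.tpr"
echo "$CMD"                              # single-node launch, no mpirun needed
```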


On Fri, Apr 26, 2013 at 8:35 PM, Albert  wrote:

> Dear:
>
>  I've got two GTX 690s in a workstation and I found that when I run the MD
> production with the following two commands:
>
> mpirun -np 4 md_run_mpi
>
> or
>
> mpirun -np 2 md_run_mpi
>
> the efficiency is the same. I notice that Gromacs can detect 4 GPUs
> (probably because the GTX 690 has two GPU dies...):
>
> 4 GPUs detected on host node4:
>   #0: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
>   #1: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
>   #2: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
>   #3: NVIDIA GeForce GTX 690, compute cap.: 3.0, ECC:  no, stat: compatible
>
>
> why do "-np 2" and "-np 4" give the same efficiency? Shouldn't "-np 4" be
> faster?
>
> thank you very much
>
> Albert
>


Re: [gmx-users] GPU performance

2013-04-10 Thread Szilárd Páll
On Wed, Apr 10, 2013 at 3:34 AM, Benjamin Bobay  wrote:

> Szilárd -
>
> First, many thanks for the reply.
>
> Second, I am glad that I am not crazy.
>
> Ok so based on your suggestions, I think I know what the problem is/was.
> There was a sander process running on 1 of the CPUs.  Clearly GROMACS was
> trying to use 4 with "Using 4 OpenMP thread". I just did not catch that.
> Sorry! Rookie mistake.
>
> Which I guess leads me to my next question (sorry if it's too naive):
>
> (1) When running GROMACS (or, I guess, any other CUDA-based program), it's
> best to have all the CPUs free, right? I guess based on my results I have
> pretty much answered that question.  Although I thought that as long as I
> have one CPU available to run the GPU it would be good: would setting
> "-ntmpi 1 -ntomp 1" help, or would I take a major hit in ns/day as well?
>

Such behavior is not specific to GROMACS or CUDA-accelerated codes; it applies
to all compute-intensive codes that expect to be running "alone" on the set of
CPU cores they are started on. As you could see in the output, mdrun
automatically detected that you have 4 CPU cores and, as Mark said, it
tries to use all of them alongside the GPU. As one of the cores was busy, you
ended up in a situation in which four threads of mdrun plus the
(presumably) one thread of sander are competing for four cores. This is
made even worse by the fact that when using a full machine, mdrun locks its
threads to physical cores to prevent the OS from moving them around (which
can cause performance loss).

Secondly, using a single core with a GPU will not give very good
performance in GROMACS. The current GROMACS acceleration expects to run on
a couple of CPU cores together with a GPU - which is the typical balance of
CPU-GPU hardware in most clusters (1 GPU/socket) as well as what many home
users would have (1-2 GPUs for 4-8 CPU cores).


>
> If I try the benchmarks again just to see (for fun) with "Using 4 OpenMP
> thread", under top I have - so I think the CPU is fine :
> PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> 24791 bobayb20   0 48.3g  51m 7576 R 299.1  0.2  11:32.90
> mdrun
>

Nope, that just means, roughly speaking, that sander is probably fully
using one core and the four threads of mdrun are "crammed" onto the remaining
three cores - which is bad.

However, you can simply run mdrun using three threads which will run fine
along sander. Whether this will be efficient or not, you'll have to see.
Note that if some other program is using the GPU as well, don't expect full
performance - but the difference will be much less than in the case
of oversubscribed CPU cores.
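Szilárd's suggestion of leaving one core free for sander can be sketched as a small launcher; the core counts are assumptions for this particular quad-core machine, and the mdrun line is only echoed:

```shell
NCORES=4          # physical cores on the box
RESERVED=1        # core left free for the running sander job
NTOMP=$((NCORES - RESERVED))
# -pin off: don't lock threads to cores, since another job shares the CPU
echo "mdrun -ntmpi 1 -ntomp $NTOMP -pin off -s md.tpr"
```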

Cheers,
--
Szilárd


>
> When I have a chance (after this sander run is done - hopefully soon) I can
> try the benchmarks again.
>
> Thanks again for the help!
>
> Ben


Re: [gmx-users] GPU performance

2013-04-09 Thread Mark Abraham
On Apr 10, 2013 3:34 AM, "Benjamin Bobay"  wrote:
>
> Szilárd -
>
> First, many thanks for the reply.
>
> Second, I am glad that I am not crazy.
>
> Ok so based on your suggestions, I think I know what the problem is/was.
> There was a sander process running on 1 of the CPUs.  Clearly GROMACS was
> trying to use 4 with "Using 4 OpenMP thread". I just did not catch that.
> Sorry! Rookie mistake.
>
> Which I guess leads me to my next question (sorry if it's too naive):
>
> (1) When running GROMACS (or, I guess, any other CUDA-based program), it's
> best to have all the CPUs free, right? I guess based on my results I have
> pretty much answered that question.  Although I thought that as long as I
> have one CPU available to run the GPU it would be good: would setting
> "-ntmpi 1 -ntomp 1" help, or would I take a major hit in ns/day as well?

Some codes might treat the CPU as a "I/O, MPI and memory-serving
co-processor" of the GPU; those codes will tend to be insensitive to the
CPU config. GROMACS goes to great lengths to use all the hardware in a
dynamically load-balanced way, so CPU load and config tend to affect the
bottom line immediately.

Mark

> If I try the benchmarks again just to see (for fun) with "Using 4 OpenMP
> thread", under top I have - so I think the CPU is fine :
> PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> 24791 bobayb20   0 48.3g  51m 7576 R 299.1  0.2  11:32.90
> mdrun
>
>
> When I have a chance (after this sander run is done - hopefully soon) I
can
> try the benchmarks again.
>
> Thanks again for the help!
>
> Ben


Re: [gmx-users] GPU performance

2013-04-09 Thread Benjamin Bobay
Szilárd -

First, many thanks for the reply.

Second, I am glad that I am not crazy.

Ok so based on your suggestions, I think I know what the problem is/was.
There was a sander process running on 1 of the CPUs.  Clearly GROMACS was
trying to use 4 with "Using 4 OpenMP thread". I just did not catch that.
Sorry! Rookie mistake.

Which I guess leads me to my next question (sorry if it's too naive):

(1) When running GROMACS (or, I guess, any other CUDA-based program), it's
best to have all the CPUs free, right? I guess based on my results I have
pretty much answered that question.  Although I thought that as long as I
have one CPU available to run the GPU it would be good: would setting
"-ntmpi 1 -ntomp 1" help, or would I take a major hit in ns/day as well?

If I try the benchmarks again just to see (for fun) with "Using 4 OpenMP
thread", under top I have - so I think the CPU is fine :
PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
24791 bobayb20   0 48.3g  51m 7576 R 299.1  0.2  11:32.90
mdrun


When I have a chance (after this sander run is done - hopefully soon) I can
try the benchmarks again.

Thanks again for the help!

Ben


Re: [gmx-users] GPU performance

2013-04-09 Thread Szilárd Páll
Hi Ben,

That performance is not reasonable at all - neither for CPU only run on
your quad-core Sandy Bridge, nor for the CPU+GPU run. For the latter you
should be getting more like 50 ns/day or so.

What's strange about your run is that the CPU-GPU load balancing is picking
a *very* long cut-off which means that your CPU is for some reason
performing very badly. Check how is mdrun behaving while running in
top/htop and if you are not seeing ~400% CPU utilization, there is
something wrong - perhaps threads getting locked to the same core (to check
that try -pin off).

Secondly, note that you are using OpenMM-specific settings from the old
GROMACS-OpenMM comparison benchmarks in which the grid spacing is overly
coarse (you could use something like a fourier-spacing=0.125 or even larger
with rc=1.0).
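A hedged .mdp fragment along those lines (the values are Szilárd's suggestions above, not validated settings, and the file name is hypothetical):

```shell
# Write the suggested PME settings as an .mdp fragment and show them
cat > pme_settings.mdp <<'EOF'
rcoulomb         = 1.0
fourier-spacing  = 0.125
EOF
grep 'fourier-spacing' pme_settings.mdp
```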

Cheers,

--
Szilárd


On Tue, Apr 9, 2013 at 10:27 PM, Benjamin Bobay  wrote:

> Good afternoon -
>
> I recently installed gromacs-4.6 on CentOS6.3 and the installation went
> just fine.
>
> I have a Tesla C2075 GPU.
>
> I then downloaded the benchmark directories and ran a bench mark on the
> GPU/ dhfr-solv-PME.bench
>
> This is what I got:
>
> Using 1 MPI thread
> Using 4 OpenMP threads
>
> 1 GPU detected:
>   #0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible
>
> 1 GPU user-selected for this run: #0
>
>
> Back Off! I just backed up ener.edr to ./#ener.edr.1#
> starting mdrun 'Protein in water'
> -1 steps, infinite ps.
> step   40: timed with pme grid 64 64 64, coulomb cutoff 1.000: 4122.9
> M-cycles
> step   80: timed with pme grid 56 56 56, coulomb cutoff 1.143: 3685.9
> M-cycles
> step  120: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3110.8
> M-cycles
> step  160: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3365.1
> M-cycles
> step  200: timed with pme grid 40 40 40, coulomb cutoff 1.600: 3499.0
> M-cycles
> step  240: timed with pme grid 52 52 52, coulomb cutoff 1.231: 3982.2
> M-cycles
> step  280: timed with pme grid 48 48 48, coulomb cutoff 1.333: 3129.2
> M-cycles
> step  320: timed with pme grid 44 44 44, coulomb cutoff 1.455: 3425.4
> M-cycles
> step  360: timed with pme grid 42 42 42, coulomb cutoff 1.524: 2979.1
> M-cycles
>   optimal pme grid 42 42 42, coulomb cutoff 1.524
> step 4300 performance: 1.8 ns/day
>
> and from the nvidia-smi output:
> Tue Apr  9 10:13:46 2013
> +--+
>
> | NVIDIA-SMI 4.304.37   Driver Version: 304.37
> |
>
> |---+--+--+
> | GPU  Name | Bus-IdDisp.  | Volatile Uncorr.
> ECC |
> | Fan  Temp  Perf  Pwr:Usage/Cap| Memory-Usage | GPU-Util  Compute
> M. |
>
> |===+==+==|
> |   0  Tesla C2075  | :03:00.0  On |
> 0 |
> | 30%   67CP080W / 225W |   4%  200MB / 5375MB |  4%
> Default |
>
> +---+--+--+
>
>
>
> +-+
> | Compute processes:   GPU
> Memory |
> |  GPU   PID  Process name
> Usage  |
>
> |=|
> |0 22568  mdrun
> 59MB  |
>
> +-+
>
>
> So I am only getting 1.8 ns/day! Is that right? It seems very
> slow compared to the CPU test, where I am getting the same:
>
> step 200 performance: 1.8 ns/dayvol 0.79  imb F 14%
>
> >From the md.log of the GPU test:
> Detecting CPU-specific acceleration.
> Present hardware specification:
> Vendor: GenuineIntel
> Brand:  Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz
> Family:  6  Model: 45  Stepping:  7
> Features: aes apic avx clfsh cmov cx8 cx16 htt lahf_lm mmx msr nonstop_tsc
> pcid pclmuldq pdcm pdpe1gb popcnt pse rdtscp sse2 sse3 sse4.1 sse4.2 ssse3
> tdt x2a
> pic
> Acceleration most likely to fit this hardware: AVX_256
> Acceleration selected at GROMACS compile time: AVX_256
>
>
> 1 GPU detected:
>   #0: NVIDIA Tesla C2075, compute cap.: 2.0, ECC: yes, stat: compatible
>
> 1 GPU user-selected for this run: #0
>
> Will do PME sum in reciprocal space.
>
> Any thoughts as to why it is so slow?
>
> many thanks!
> Ben
>
> --
> 
> Research Assistant Professor
> North Carolina State University
> Department of Molecular and Structural Biochemistry
> 128 Polk Hall
> Raleigh, NC 27695
> Phone: (919)-513-0698
> Fax: (919)-515-2047
> 

Re: [gmx-users] GPU version of GROMACS 4.6 in MacOS cluster

2013-03-08 Thread George Patargias
Hi Szilard

Thanks for this tip; it was extremely useful. The problem was indeed an
incompatibility between the installed NVIDIA driver and the CUDA 5.0
runtime library. Installing an older driver solved the problem. The
programs (deviceQuery etc.) can now detect the GPU.

GROMACS can now also detect the card but unfortunately aborts with the
following error:

Fatal error: Incorrect launch configuration: mismatching number of PP MPI
processes and GPUs per node.
mdrun_mpi was started with 12 PP MPI processes per node, but only 1 GPU
were detected.

Here is my command line
mpirun -np 12 mdrun_mpi -s test.tpr -deffnm test_out -nb gpu

What could be the problem?

Thanks again
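(For context: the error is about the per-node mapping of PP MPI ranks to GPUs. With one GPU, only one PP rank can run non-bonded work on it; the remaining cores are used via OpenMP threads instead. A toy sketch of that consistency check follows; `assign_gpus` is a hypothetical illustration, not GROMACS code:)

```python
def assign_gpus(n_pp_ranks, gpu_ids):
    """Toy model of mdrun's per-node check: each PP MPI rank
    must be paired with exactly one detected/selected GPU."""
    if n_pp_ranks != len(gpu_ids):
        raise ValueError(
            "Incorrect launch configuration: mismatching number of "
            f"PP MPI processes ({n_pp_ranks}) and GPUs ({len(gpu_ids)}) per node")
    # rank i runs its non-bonded work on gpu_ids[i]
    return dict(enumerate(gpu_ids))

# The case above: 12 PP ranks but a single GPU -> the fatal error
try:
    assign_gpus(12, [0])
except ValueError as err:
    print(err)

# One PP rank per GPU is accepted; spare cores go to OpenMP threads instead
print(assign_gpus(1, [0]))
```

So with a single GPU, a launch along the lines of `mpirun -np 1 mdrun_mpi -s test.tpr -deffnm test_out -nb gpu` (letting mdrun fill the node with OpenMP threads) would satisfy this check.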

> Hi George,
> As I said before, that just means that most probably the GPU driver is
not
> compatible with the CUDA runtime (libcudart) that you installed with the
CUDA toolkit. I've no clue about the Mac OS installers and releases,
you'll
> have to do the research on that. Let us know if you have further
(GROMACS-related) issues.
> Cheers,
> --
> Szilárd
> On Fri, Mar 1, 2013 at 2:48 PM, George Patargias 
wrote:
>> Hi Szilárd
>> Thanks for your reply. I have run the deviceQuery utility and what I
got
>> back is
>> /deviceQuery Starting...
>>  CUDA Device Query (Runtime API) version (CUDART static linking)
>> cudaGetDeviceCount returned 38
>> -> no CUDA-capable device is detected
>> Should I understand from this that the CUDA driver was not installed from
>> the MAC OS  X CUDA 5.0 Production Release?
>> George
>> > HI,
>> > That looks like the driver does not work or is incompatible with the
runtime. Please get the SDK, compile a simple program, e.g.
>> deviceQuery
>> > and
>> > see if that works (I suspect that it won't).
>> > Regarding your machines, just FYI, the Quadro 4000 is a pretty slow
>> card
>> > (somewhat slower than a GTX 460) so you'll have a quite strong
>> resource
>> > imbalance: a lot of CPU compute power (2x Xeon 5xxx, right?) and
>> little
>> > GPU
>> > compute power which will lead to the CPU idling while waiting for the
>> GPU.
>> > Cheers,
>> > --
>> > Szilárd
>> > On Thu, Feb 28, 2013 at 4:52 PM, George Patargias
>> > wrote:
>> >> Hello
>> >> We are trying to install the GPU version of GROMACS 4.6 on our own
MacOS cluster. So for the cluster nodes that have the NVIDIA Quadro
>> 4000
>> >> cards:
>> >> - We have downloaded and installed the Mac OS X CUDA 5.0 Production
Release
>> >> from here: https://developer.nvidia.com/cuda-downloads
>> >> placing the libraries contained in this download in
>> /usr/local/cuda/lib
>> >> - We have managed to compile GROMACS 4.6 linking it statically with
these
>> >> CUDA libraries and the MPI libraries (with BUILD_SHARED_LIBS=OFF and
GMX_PREFER_STATIC_LIBS=ON)
>> >> Unfortunately, when we tried to run a test job with the generated
mdrun_mpi, GROMACS reported that it cannot detect any CUDA-enabled
devices. It also reports 0.0 version for CUDA driver and runtime. Is
the actual CUDA driver missing from the MAC OS  X CUDA 5.0
>> Production
>> >> Release that we installed? Do we need to install it from here:
http://www.nvidia.com/object/cuda-mac-driver.html
>> >> Or is something else that we need to do?
>> >> Many thanks in advance.
>> >> George
>> >> Dr. George Patargias
>> >> Postdoctoral Researcher
>> >> Biomedical Research Foundation
>> >> Academy of Athens
>> >> 4, Soranou Ephessiou
>> >> 115 27
>> >> Athens
>> >> Greece
>> >> Office: +302106597568
>> Dr. George Patargias
>> Postdoctoral Researcher
>> Biomedical Research Foundation
>> Academy of Athens
>> 4, Soranou Ephessiou
>> 115 27
>> Athens
>> Greece
>> Office: +302106597568

Re: [gmx-users] GPU version of GROMACS 4.6 in MacOS cluster

2013-03-01 Thread Albert

The easiest solution is to kill MacOS and switch to Linux.

;-)

Albert


On 03/01/2013 06:03 PM, Szilárd Páll wrote:

Hi George,

As I said before, that just means that most probably the GPU driver is not
compatible with the CUDA runtime (libcudart) that you installed with the
CUDA toolkit. I've no clue about the Mac OS installers and releases, you'll
have to do the research on that. Let us know if you have further
(GROMACS-related) issues.

Cheers,

--
Szilárd




Re: [gmx-users] GPU version of GROMACS 4.6 in MacOS cluster

2013-03-01 Thread Szilárd Páll
Hi George,

As I said before, that just means that most probably the GPU driver is not
compatible with the CUDA runtime (libcudart) that you installed with the
CUDA toolkit. I've no clue about the Mac OS installers and releases, you'll
have to do the research on that. Let us know if you have further
(GROMACS-related) issues.

Cheers,

--
Szilárd


On Fri, Mar 1, 2013 at 2:48 PM, George Patargias  wrote:

> Hi Szilárd
>
> Thanks for your reply. I have run the deviceQuery utility and what I got
> back is
>
> /deviceQuery Starting...
>
>  CUDA Device Query (Runtime API) version (CUDART static linking)
>
> cudaGetDeviceCount returned 38
> -> no CUDA-capable device is detected
>
> Should I understand from this that the CUDA driver was not installed from
> the MAC OS  X CUDA 5.0 Production Release?
>
> George
>
>
> > HI,
> >
> > That looks like the driver does not work or is incompatible with the
> > runtime. Please get the SDK, compile a simple program, e.g. deviceQuery
> > and
> > see if that works (I suspect that it won't).
> >
> > Regarding your machines, just FYI, the Quadro 4000 is a pretty slow card
> > (somewhat slower than a GTX 460) so you'll have a quite strong resource
> > imbalance: a lot of CPU compute power (2x Xeon 5xxx, right?) and little
> > GPU
> > compute power which will lead to the CPU idling while waiting for the
> GPU.
> >
> > Cheers,
> >
> > --
> > Szilárd
> >
> >
> > On Thu, Feb 28, 2013 at 4:52 PM, George Patargias
> > wrote:
> >
> >> Hello
> >>
> >> We are trying to install the GPU version of GROMACS 4.6 on our own
> >> MacOS cluster. So for the cluster nodes that have the NVIDIA Quadro 4000
> >> cards:
> >>
> >> - We have downloaded and installed the Mac OS X CUDA 5.0 Production
> >> Release
> >> from here: https://developer.nvidia.com/cuda-downloads
> >>
> >> placing the libraries contained in this download in /usr/local/cuda/lib
> >>
> >> - We have managed to compile GROMACS 4.6 linking it statically with
> >> these
> >> CUDA libraries and the MPI libraries (with BUILD_SHARED_LIBS=OFF and
> >> GMX_PREFER_STATIC_LIBS=ON)
> >>
> >> Unfortunately, when we tried to run a test job with the generated
> >> mdrun_mpi, GROMACS reported that it cannot detect any CUDA-enabled
> >> devices. It also reports 0.0 version for CUDA driver and runtime.
> >>
> >> Is the actual CUDA driver missing from the MAC OS  X CUDA 5.0 Production
> >> Release that we installed? Do we need to install it from here:
> >>
> >> http://www.nvidia.com/object/cuda-mac-driver.html
> >>
> >> Or is something else that we need to do?
> >>
> >> Many thanks in advance.
> >> George
> >>
> >>
> >> Dr. George Patargias
> >> Postdoctoral Researcher
> >> Biomedical Research Foundation
> >> Academy of Athens
> >> 4, Soranou Ephessiou
> >> 115 27
> >> Athens
> >> Greece
> >>
> >> Office: +302106597568
> >>
> >>
> >
>
>
> Dr. George Patargias
> Postdoctoral Researcher
> Biomedical Research Foundation
> Academy of Athens
> 4, Soranou Ephessiou
> 115 27
> Athens
> Greece
>
> Office: +302106597568
>


Re: [gmx-users] GPU version of GROMACS 4.6 in MacOS cluster

2013-03-01 Thread George Patargias
Hi Szilárd

Thanks for your reply. I have run the deviceQuery utility and what I got
back is

/deviceQuery Starting...

 CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected

Should I understand from this that the CUDA driver was not installed from
the MAC OS  X CUDA 5.0 Production Release?

George
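(As a side note, the 38 returned by cudaGetDeviceCount is a cudaError_t value; in the CUDA 5.x runtime that value corresponds to cudaErrorNoDevice, which is exactly the "no CUDA-capable device is detected" condition deviceQuery prints. A tiny lookup sketch; the two entries below are a hand-written excerpt, so treat the numeric mapping as an assumption rather than the full cudart table:)

```python
# Excerpt of CUDA runtime (cudart) error codes as of the CUDA 5.x era;
# only the entries relevant to this thread are listed (assumed values).
CUDA_ERRORS = {
    0: "cudaSuccess",
    38: "cudaErrorNoDevice",  # "no CUDA-capable device is detected"
}

def explain(code):
    """Translate a cudaError_t integer into its symbolic name."""
    return CUDA_ERRORS.get(code, "unrecognized cudaError_t ({})".format(code))

print(explain(38))  # -> cudaErrorNoDevice
```

That is, deviceQuery is reporting a driver-level failure, not a GROMACS problem.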


> HI,
>
> That looks like the driver does not work or is incompatible with the
> runtime. Please get the SDK, compile a simple program, e.g. deviceQuery
> and
> see if that works (I suspect that it won't).
>
> Regarding your machines, just FYI, the Quadro 4000 is a pretty slow card
> (somewhat slower than a GTX 460) so you'll have a quite strong resource
> imbalance: a lot of CPU compute power (2x Xeon 5xxx, right?) and little
> GPU
> compute power which will lead to the CPU idling while waiting for the GPU.
>
> Cheers,
>
> --
> Szilárd
>
>
> On Thu, Feb 28, 2013 at 4:52 PM, George Patargias
> wrote:
>
>> Hello
>>
>> We are trying to install the GPU version of GROMACS 4.6 on our own
>> MacOS cluster. So for the cluster nodes that have the NVIDIA Quadro 4000
>> cards:
>>
>> - We have downloaded and installed the Mac OS X CUDA 5.0 Production
>> Release
>> from here: https://developer.nvidia.com/cuda-downloads
>>
>> placing the libraries contained in this download in /usr/local/cuda/lib
>>
>> - We have managed to compile GROMACS 4.6 linking it statically with
>> these
>> CUDA libraries and the MPI libraries (with BUILD_SHARED_LIBS=OFF and
>> GMX_PREFER_STATIC_LIBS=ON)
>>
>> Unfortunately, when we tried to run a test job with the generated
>> mdrun_mpi, GROMACS reported that it cannot detect any CUDA-enabled
>> devices. It also reports 0.0 version for CUDA driver and runtime.
>>
>> Is the actual CUDA driver missing from the MAC OS  X CUDA 5.0 Production
>> Release that we installed? Do we need to install it from here:
>>
>> http://www.nvidia.com/object/cuda-mac-driver.html
>>
>> Or is something else that we need to do?
>>
>> Many thanks in advance.
>> George
>>
>>
>> Dr. George Patargias
>> Postdoctoral Researcher
>> Biomedical Research Foundation
>> Academy of Athens
>> 4, Soranou Ephessiou
>> 115 27
>> Athens
>> Greece
>>
>> Office: +302106597568
>>
>>
>


Dr. George Patargias
Postdoctoral Researcher
Biomedical Research Foundation
Academy of Athens
4, Soranou Ephessiou
115 27
Athens
Greece

Office: +302106597568



Re: [gmx-users] GPU version of GROMACS 4.6 in MacOS cluster

2013-03-01 Thread Szilárd Páll
Hi,

That looks like the driver does not work or is incompatible with the
runtime. Please get the SDK, compile a simple program, e.g. deviceQuery and
see if that works (I suspect that it won't).

Regarding your machines, just FYI, the Quadro 4000 is a pretty slow card
(somewhat slower than a GTX 460) so you'll have a quite strong resource
imbalance: a lot of CPU compute power (2x Xeon 5xxx, right?) and little GPU
compute power which will lead to the CPU idling while waiting for the GPU.

Cheers,

--
Szilárd


On Thu, Feb 28, 2013 at 4:52 PM, George Patargias wrote:

> Hello
>
> We are trying to install the GPU version of GROMACS 4.6 on our own
> MacOS cluster. So for the cluster nodes that have the NVIDIA Quadro 4000
> cards:
>
> - We have downloaded and installed the Mac OS X CUDA 5.0 Production Release
> from here: https://developer.nvidia.com/cuda-downloads
>
> placing the libraries contained in this download in /usr/local/cuda/lib
>
> - We have managed to compile GROMACS 4.6 linking it statically with these
> CUDA libraries and the MPI libraries (with BUILD_SHARED_LIBS=OFF and
> GMX_PREFER_STATIC_LIBS=ON)
>
> Unfortunately, when we tried to run a test job with the generated
> mdrun_mpi, GROMACS reported that it cannot detect any CUDA-enabled
> devices. It also reports 0.0 version for CUDA driver and runtime.
>
> Is the actual CUDA driver missing from the MAC OS  X CUDA 5.0 Production
> Release that we installed? Do we need to install it from here:
>
> http://www.nvidia.com/object/cuda-mac-driver.html
>
> Or is something else that we need to do?
>
> Many thanks in advance.
> George
>
>
> Dr. George Patargias
> Postdoctoral Researcher
> Biomedical Research Foundation
> Academy of Athens
> 4, Soranou Ephessiou
> 115 27
> Athens
> Greece
>
> Office: +302106597568
>
>
>
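(For reference, the static-link configure step George describes might be reproduced like this. This is a sketch assembled only from the flags he names; the build directory layout, install prefix, and CUDA toolkit path are placeholders, not details from the thread:)

```shell
# Configure GROMACS 4.6 from its source tree with MPI + CUDA,
# linking statically (BUILD_SHARED_LIBS / GMX_PREFER_STATIC_LIBS
# are the flags given in George's message; paths are placeholders).
cmake .. \
  -DGMX_MPI=ON \
  -DGMX_GPU=ON \
  -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
  -DBUILD_SHARED_LIBS=OFF \
  -DGMX_PREFER_STATIC_LIBS=ON \
  -DCMAKE_INSTALL_PREFIX=/opt/gromacs-4.6
```

Even with such a build, the runtime still needs a working kernel-level CUDA driver, which is the component suspected to be missing or mismatched here.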


Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-18 Thread Albert

On 12/17/2012 08:06 PM, Justin Lemkul wrote:
It seems to me that the system is simply crashing like any other that 
becomes unstable.  Does the simulation run at all on plain CPU?


-Justin 



Thank you very much Justin, it's really helpful. I've checked the
structure after minimization and found that there was a problem with my
ligand. I regenerated the ligand topology with acpype and resubmitted the
minimization and NVT. Now it goes well, so the problem probably came from
the incorrect ligand topology, which made the system very unstable.


best
Albert


Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Justin Lemkul



On 12/17/12 2:03 PM, Albert wrote:

well, that's one of the log files.
I've tried all of

VERSION 4.6-dev-20121004-5d6c49d
VERSION 4.6-beta1
VERSION 4.6-beta2
and the latest 5.0 by git.

the problems are the same. :-(



It seems to me that the system is simply crashing like any other that becomes 
unstable.  Does the simulation run at all on plain CPU?


-Justin





On 12/17/2012 07:56 PM, Mark Abraham wrote:

On Mon, Dec 17, 2012 at 6:01 PM, Albert  wrote:


>hello:
>
>  I reduced the GPU to two, and it said:
>
>Back Off! I just backed up nvt.log to ./#nvt.log.1#
>Reading file nvt.tpr, VERSION 4.6-dev-20121004-5d6c49d (single precision)
>

This is a development version from October 1. Please use the mdrun version
you think you're using:-)

Mark
--




--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Albert

well, that's one of the log files.
I've tried all of

VERSION 4.6-dev-20121004-5d6c49d
VERSION 4.6-beta1
VERSION 4.6-beta2
and the latest 5.0 by git.

the problems are the same. :-(




On 12/17/2012 07:56 PM, Mark Abraham wrote:

On Mon, Dec 17, 2012 at 6:01 PM, Albert  wrote:


>hello:
>
>  I reduced the GPU to two, and it said:
>
>Back Off! I just backed up nvt.log to ./#nvt.log.1#
>Reading file nvt.tpr, VERSION 4.6-dev-20121004-5d6c49d (single precision)
>

This is a development version from October 1. Please use the mdrun version
you think you're using:-)

Mark
--




Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Szilárd Páll
On Mon, Dec 17, 2012 at 7:56 PM, Mark Abraham wrote:

> On Mon, Dec 17, 2012 at 6:01 PM, Albert  wrote:
>
> > hello:
> >
> >  I reduced the GPU to two, and it said:
> >
> > Back Off! I just backed up nvt.log to ./#nvt.log.1#
> > Reading file nvt.tpr, VERSION 4.6-dev-20121004-5d6c49d (single precision)
> >
>
> This is a development version from October 1. Please use the mdrun version
> you think you're using :-)
>

Thanks Mark, good catch!

--
Szilárd



>
> Mark


Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Mark Abraham
On Mon, Dec 17, 2012 at 6:01 PM, Albert  wrote:

> hello:
>
>  I reduced the GPU to two, and it said:
>
> Back Off! I just backed up nvt.log to ./#nvt.log.1#
> Reading file nvt.tpr, VERSION 4.6-dev-20121004-5d6c49d (single precision)
>

This is a development version from October 1. Please use the mdrun version
you think you're using :-)

Mark


Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Szilárd Páll
Hi Albert,

Thanks for the testing.

Last questions:
- What version are you using, the beta2 release or the latest git? If it's
the former, getting the latest git might help if...
- Do you happen to be using GMX_GPU_ACCELERATION=None (you shouldn't!)?
A bug triggered only with this setting has been fixed recently.

If the above doesn't help, please file a bug report and attach a tpr so we
can reproduce.

Cheers,

--
Szilárd



On Mon, Dec 17, 2012 at 6:21 PM, Albert  wrote:

> On 12/17/2012 06:08 PM, Szilárd Páll wrote:
>
>> Hi,
>>
>> How about GPU emulation or CPU-only runs? Also, please try setting the
>> number of threads to 1 (-ntomp 1).
>>
>>
>> --
>> Szilárd
>>
>>
> hello:
>
> I am running in GPU emulation mode with the GMX_EMULATE_GPU=1 env. var
> set (and, to match the GPU setup more closely, with -ntomp 12); it failed with this log:
>
> Back Off! I just backed up step33b.pdb to ./#step33b.pdb.2#
>
> Back Off! I just backed up step33c.pdb to ./#step33c.pdb.2#
>
> Wrote pdb files with previous and current coordinates
> [CUDANodeA:20753] *** Process received signal ***
> [CUDANodeA:20753] Signal: Segmentation fault (11)
> [CUDANodeA:20753] Signal code: Address not mapped (1)
> [CUDANodeA:20753] Failing at address: 0x106ae6a00
>
> [1]  Segmentation fault  mdrun_mpi -v -s nvt.tpr -c nvt.gro -g
> nvt.log -x nvt.xtc -ntomp 12
>
>
>
>
> I also tried setting the number of threads to 1 (-ntomp 1); it failed with the following
> messages:
>
>
> Back Off! I just backed up step33c.pdb to ./#step33c.pdb.1#
>
> Wrote pdb files with previous and current coordinates
> [CUDANodeA:20740] *** Process received signal ***
> [CUDANodeA:20740] Signal: Segmentation fault (11)
> [CUDANodeA:20740] Signal code: Address not mapped (1)
> [CUDANodeA:20740] Failing at address: 0x1f74a96ec
> [CUDANodeA:20740] [ 0] /lib64/libpthread.so.0(+0xf2d0) [0x2b351d3022d0]
> [CUDANodeA:20740] [ 1] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x11020f) [0x2b351a99c20f]
> [CUDANodeA:20740] [ 2] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x111c94) [0x2b351a99dc94]
> [CUDANodeA:20740] [ 3] /opt/gromacs-4.6/lib/libmd_mpi.so.6(gmx_pme_do+0x1d2e) [0x2b351a9a1bae]
> [CUDANodeA:20740] [ 4] /opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_lowlevel+0x1eef) [0x2b351a97262f]
> [CUDANodeA:20740] [ 5] /opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_cutsVERLET+0x1756) [0x2b351aa04736]
> [CUDANodeA:20740] [ 6] /opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force+0x3bf) [0x2b351aa0a0df]
> [CUDANodeA:20740] [ 7] mdrun_mpi(do_md+0x8133) [0x4334c3]
> [CUDANodeA:20740] [ 8] mdrun_mpi(mdrunner+0x19e9) [0x411639]
> [CUDANodeA:20740] [ 9] mdrun_mpi(main+0x17db) [0x4373db]
> [CUDANodeA:20740] [10] /lib64/libc.so.6(__libc_start_main+0xfd) [0x2b351d52ebfd]
> [CUDANodeA:20740] [11] mdrun_mpi() [0x407f09]
> [CUDANodeA:20740] *** End of error message ***
>
> [1]  Segmentation fault  mdrun_mpi -v -s nvt.tpr -c nvt.gro -g
> nvt.log -x nvt.xtc -ntomp 1
>
>
>
>
>


Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Albert

On 12/17/2012 06:08 PM, Szilárd Páll wrote:

Hi,

How about GPU emulation or CPU-only runs? Also, please try setting the
number of threads to 1 (-ntomp 1).


--
Szilárd



hello:

I am running in GPU emulation mode with the GMX_EMULATE_GPU=1 env. var
set (and, to match the GPU setup more closely, with -ntomp 12); it failed with this log:

Back Off! I just backed up step33b.pdb to ./#step33b.pdb.2#

Back Off! I just backed up step33c.pdb to ./#step33c.pdb.2#
Wrote pdb files with previous and current coordinates
[CUDANodeA:20753] *** Process received signal ***
[CUDANodeA:20753] Signal: Segmentation fault (11)
[CUDANodeA:20753] Signal code: Address not mapped (1)
[CUDANodeA:20753] Failing at address: 0x106ae6a00

[1]  Segmentation fault  mdrun_mpi -v -s nvt.tpr -c nvt.gro -g
nvt.log -x nvt.xtc -ntomp 12




I also tried setting the number of threads to 1 (-ntomp 1); it failed with the following
messages:


Back Off! I just backed up step33c.pdb to ./#step33c.pdb.1#
Wrote pdb files with previous and current coordinates
[CUDANodeA:20740] *** Process received signal ***
[CUDANodeA:20740] Signal: Segmentation fault (11)
[CUDANodeA:20740] Signal code: Address not mapped (1)
[CUDANodeA:20740] Failing at address: 0x1f74a96ec
[CUDANodeA:20740] [ 0] /lib64/libpthread.so.0(+0xf2d0) [0x2b351d3022d0]
[CUDANodeA:20740] [ 1] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x11020f) 
[0x2b351a99c20f]
[CUDANodeA:20740] [ 2] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x111c94) 
[0x2b351a99dc94]
[CUDANodeA:20740] [ 3] 
/opt/gromacs-4.6/lib/libmd_mpi.so.6(gmx_pme_do+0x1d2e) [0x2b351a9a1bae]
[CUDANodeA:20740] [ 4] 
/opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_lowlevel+0x1eef) 
[0x2b351a97262f]
[CUDANodeA:20740] [ 5] 
/opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_cutsVERLET+0x1756) 
[0x2b351aa04736]
[CUDANodeA:20740] [ 6] 
/opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force+0x3bf) [0x2b351aa0a0df]

[CUDANodeA:20740] [ 7] mdrun_mpi(do_md+0x8133) [0x4334c3]
[CUDANodeA:20740] [ 8] mdrun_mpi(mdrunner+0x19e9) [0x411639]
[CUDANodeA:20740] [ 9] mdrun_mpi(main+0x17db) [0x4373db]
[CUDANodeA:20740] [10] /lib64/libc.so.6(__libc_start_main+0xfd) 
[0x2b351d52ebfd]

[CUDANodeA:20740] [11] mdrun_mpi() [0x407f09]
[CUDANodeA:20740] *** End of error message ***

[1]  Segmentation fault  mdrun_mpi -v -s nvt.tpr -c nvt.gro
-g nvt.log -x nvt.xtc -ntomp 1






Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Szilárd Páll
Hi,

How about GPU emulation or CPU-only runs? Also, please try setting the
number of threads to 1 (-ntomp 1).


--
Szilárd



On Mon, Dec 17, 2012 at 6:01 PM, Albert  wrote:

> hello:
>
>  I reduced the GPU to two, and it said:
>
> Back Off! I just backed up nvt.log to ./#nvt.log.1#
> Reading file nvt.tpr, VERSION 4.6-dev-20121004-5d6c49d (single precision)
>
> NOTE: GPU(s) found, but the current simulation can not use GPUs
>   To use a GPU, set the mdp option: cutoff-scheme = Verlet
>   (for quick performance testing you can use the -testverlet option)
>
> Using 2 MPI processes
>
> 4 GPUs detected on host CUDANodeA:
>   #0: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>   #1: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>   #2: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>   #3: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
>
> Making 1D domain decomposition 2 x 1 x 1
>
> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> We have just committed the new CPU detection code in this branch,
> and will commit new SSE/AVX kernels in a few days. However, this
> means that currently only the NxN kernels are accelerated!
> In the mean time, you might want to avoid production runs in 4.6.
>
>
> when I run it with single GPU, it produced lots of pdb file with prefix
> "step", and then it crashed with messages:
>
> Wrote pdb files with previous and current coordinates
> Warning: 1-4 interaction between 4674 and 4706 at distance 434.986 which
> is larger than the 1-4 table size 2.200 nm
> These are ignored for the rest of the simulation
> This usually means your system is exploding,
> if not, you should increase table-extension in your mdp file
> or with user tables increase the table size
> [CUDANodeA:20659] *** Process received signal ***
> [CUDANodeA:20659] Signal: Segmentation fault (11)
> [CUDANodeA:20659] Signal code: Address not mapped (1)
> [CUDANodeA:20659] Failing at address: 0xc7aa00dc
> [CUDANodeA:20659] [ 0] /lib64/libpthread.so.0(+0xf2d0) [0x2ab25c76d2d0]
> [CUDANodeA:20659] [ 1] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x11020f) [0x2ab259e0720f]
> [CUDANodeA:20659] [ 2] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x111c94) [0x2ab259e08c94]
> [CUDANodeA:20659] [ 3] /opt/gromacs-4.6/lib/libmd_mpi.so.6(gmx_pme_do+0x1d2e) [0x2ab259e0cbae]
> [CUDANodeA:20659] [ 4] /opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_lowlevel+0x1eef) [0x2ab259ddd62f]
> [CUDANodeA:20659] [ 5] /opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_cutsGROUP+0x1495) [0x2ab259e72a45]
> [CUDANodeA:20659] [ 6] mdrun_mpi(do_md+0x8133) [0x4334c3]
> [CUDANodeA:20659] [ 7] mdrun_mpi(mdrunner+0x19e9) [0x411639]
> [CUDANodeA:20659] [ 8] mdrun_mpi(main+0x17db) [0x4373db]
> [CUDANodeA:20659] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd) [0x2ab25c999bfd]
> [CUDANodeA:20659] [10] mdrun_mpi() [0x407f09]
> [CUDANodeA:20659] *** End of error message ***
>
> [1]  Segmentation fault  mdrun_mpi -v -s nvt.tpr -c nvt.gro -g
> nvt.log -x nvt.xtc
>
>
>
> here is the .mdp file I used:
>
> title   = NVT equilibration for OR-POPC system
> define  = -DPOSRES -DPOSRES_LIG ; Protein is position restrained
> (uses the posres.itp file information)
> ; Parameters describing the details of the NVT simulation protocol
> integrator  = md; Algorithm ("md" = molecular dynamics
> [leap-frog integrator]; "md-vv" = md using velocity verlet; sd = stochastic
> dynamics)
> dt  = 0.002 ; Time-step (ps)
> nsteps  = 250000; Number of steps to run (0.002 * 250000 =
> 500 ps)
>
> ; Parameters controlling output writing
> nstxout = 0 ; Write coordinates to output .trr file
> every 2 ps
> nstvout = 0 ; Write velocities to output .trr file
> every 2 ps
> nstfout = 0
>
> nstxtcout   = 1000
> nstenergy   = 1000  ; Write energies to output .edr file every
> 2 ps
> nstlog  = 1000  ; Write output to .log file every 2 ps
>
> ; Parameters describing neighbors searching and details about interaction
> calculations
> ns_type = grid  ; Neighbor list search method (simple,
> grid)
> nstlist = 50; Neighbor list update frequency (after
> every given number of steps)
> rlist   = 1.2   ; Neighbor list search cut-off distance
> (nm)
> rlistlong   = 1.4
> rcoulomb= 1.2   ; Short-range Coulombic interactions
> cut-off distance (nm)
> rvdw= 1.2   ; Short-range van der Waals cutoff
> distance (nm)
> pbc = xyz   ; Direction in which to use Periodic
> Boundary Conditions (xyz, xy, no)
> cutoff-scheme   = Verlet ; GPU running
>
> ; Parameters for treating bonded interactions
> continuation= no; Whether a fresh start or a continuation
> from a previous run (yes/no)
> constraint_algorithm = LINCS

Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Albert

hello:

 I reduced the GPU to two, and it said:

Back Off! I just backed up nvt.log to ./#nvt.log.1#
Reading file nvt.tpr, VERSION 4.6-dev-20121004-5d6c49d (single precision)

NOTE: GPU(s) found, but the current simulation can not use GPUs
  To use a GPU, set the mdp option: cutoff-scheme = Verlet
  (for quick performance testing you can use the -testverlet option)

Using 2 MPI processes

4 GPUs detected on host CUDANodeA:
  #0: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
  #1: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
  #2: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible
  #3: NVIDIA GeForce GTX 590, compute cap.: 2.0, ECC:  no, stat: compatible

Making 1D domain decomposition 2 x 1 x 1

* WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
We have just committed the new CPU detection code in this branch,
and will commit new SSE/AVX kernels in a few days. However, this
means that currently only the NxN kernels are accelerated!
In the mean time, you might want to avoid production runs in 4.6.


when I run it with a single GPU, it produced lots of pdb files with the 
prefix "step", and then it crashed with these messages:


Wrote pdb files with previous and current coordinates
Warning: 1-4 interaction between 4674 and 4706 at distance 434.986 which 
is larger than the 1-4 table size 2.200 nm

These are ignored for the rest of the simulation
This usually means your system is exploding,
if not, you should increase table-extension in your mdp file
or with user tables increase the table size
[CUDANodeA:20659] *** Process received signal ***
[CUDANodeA:20659] Signal: Segmentation fault (11)
[CUDANodeA:20659] Signal code: Address not mapped (1)
[CUDANodeA:20659] Failing at address: 0xc7aa00dc
[CUDANodeA:20659] [ 0] /lib64/libpthread.so.0(+0xf2d0) [0x2ab25c76d2d0]
[CUDANodeA:20659] [ 1] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x11020f) 
[0x2ab259e0720f]
[CUDANodeA:20659] [ 2] /opt/gromacs-4.6/lib/libmd_mpi.so.6(+0x111c94) 
[0x2ab259e08c94]
[CUDANodeA:20659] [ 3] 
/opt/gromacs-4.6/lib/libmd_mpi.so.6(gmx_pme_do+0x1d2e) [0x2ab259e0cbae]
[CUDANodeA:20659] [ 4] 
/opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_lowlevel+0x1eef) 
[0x2ab259ddd62f]
[CUDANodeA:20659] [ 5] 
/opt/gromacs-4.6/lib/libmd_mpi.so.6(do_force_cutsGROUP+0x1495) 
[0x2ab259e72a45]

[CUDANodeA:20659] [ 6] mdrun_mpi(do_md+0x8133) [0x4334c3]
[CUDANodeA:20659] [ 7] mdrun_mpi(mdrunner+0x19e9) [0x411639]
[CUDANodeA:20659] [ 8] mdrun_mpi(main+0x17db) [0x4373db]
[CUDANodeA:20659] [ 9] /lib64/libc.so.6(__libc_start_main+0xfd) 
[0x2ab25c999bfd]

[CUDANodeA:20659] [10] mdrun_mpi() [0x407f09]
[CUDANodeA:20659] *** End of error message ***

[1]  Segmentation fault  mdrun_mpi -v -s nvt.tpr -c nvt.gro 
-g nvt.log -x nvt.xtc




here is the .mdp file I used:

title   = NVT equilibration for OR-POPC system
define  = -DPOSRES -DPOSRES_LIG ; Protein is position restrained 
(uses the posres.itp file information)

; Parameters describing the details of the NVT simulation protocol
integrator  = md; Algorithm ("md" = molecular dynamics 
[leap-frog integrator]; "md-vv" = md using velocity verlet; sd = 
stochastic dynamics)

dt  = 0.002 ; Time-step (ps)
nsteps  = 250000; Number of steps to run (0.002 * 250000 
= 500 ps)


; Parameters controlling output writing
nstxout = 0 ; Write coordinates to output .trr file 
every 2 ps
nstvout = 0 ; Write velocities to output .trr file 
every 2 ps

nstfout = 0

nstxtcout   = 1000
nstenergy   = 1000  ; Write energies to output .edr file 
every 2 ps

nstlog  = 1000  ; Write output to .log file every 2 ps

; Parameters describing neighbors searching and details about 
interaction calculations

ns_type = grid  ; Neighbor list search method (simple, grid)
nstlist = 50; Neighbor list update frequency (after 
every given number of steps)

rlist   = 1.2   ; Neighbor list search cut-off distance (nm)
rlistlong   = 1.4
rcoulomb= 1.2   ; Short-range Coulombic interactions 
cut-off distance (nm)
rvdw= 1.2   ; Short-range van der Waals cutoff 
distance (nm)
pbc = xyz   ; Direction in which to use Periodic 
Boundary Conditions (xyz, xy, no)

cutoff-scheme   = Verlet ; GPU running

; Parameters for treating bonded interactions
continuation= no; Whether a fresh start or a 
continuation from a previous run (yes/no)

constraint_algorithm = LINCS; Constraint algorithm (LINCS / SHAKE)
constraints = all-bonds ; Which bonds/angles to constrain 
(all-bonds / hbonds / none / all-angles / h-angles)
lincs_iter  = 1 ; Number of iterations to correct for 
rotational lengthening in LINCS (related to accuracy)
lincs_order = 4 ; Highest order in the expansion of the 
const

Re: [gmx-users] GPU running problem with GMX-4.6 beta2

2012-12-17 Thread Szilárd Páll
Hi,

That unfortunately doesn't tell us exactly why mdrun is stuck. Can
you reproduce the issue on other machines or with different launch
configurations? At which step does it get stuck (-stepout 1 can help)?

Please try the following:
- try running on a single GPU;
- try running on CPUs only (-nb cpu and, to match the GPU setup more
closely, -ntomp 12);
- try running in GPU emulation mode with the GMX_EMULATE_GPU=1 env. var
set (again with -ntomp 12 to match the GPU setup);
- provide a backtrace (using gdb).
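The checklist above can be collected into a small script. This is a minimal sketch: the mdrun_mpi binary name, the nvt.tpr input and the -ntomp 12 value are taken from this thread, so adjust them to your own run.

```shell
# Save the suggested diagnostics as a script; every flag below
# (-stepout, -nb cpu, -ntomp, GMX_EMULATE_GPU) comes from the
# advice above. Run one line at a time and compare behavior.
cat > debug_gpu_hang.sh <<'EOF'
#!/bin/sh
set -x
mdrun_mpi -v -s nvt.tpr -stepout 1                   # which step hangs?
mdrun_mpi -v -s nvt.tpr -nb cpu -ntomp 12            # CPU-only baseline
GMX_EMULATE_GPU=1 mdrun_mpi -v -s nvt.tpr -ntomp 12  # GPU emulation mode
gdb --args mdrun_mpi -v -s nvt.tpr                   # then 'run', later 'bt'
EOF
chmod +x debug_gpu_hang.sh
grep -c mdrun_mpi debug_gpu_hang.sh
```

If the CPU-only and emulation runs proceed normally while the GPU run hangs, that points at the GPU code path rather than the input system.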

Cheers,

--
Szilárd



On Mon, Dec 17, 2012 at 5:37 PM, Albert  wrote:

> hello:
>
>  I am running a GMX-4.6 beta2 GPU job on a 24-CPU-core workstation with two
> GTX590s, and it gets stuck without producing any output, i.e. the .xtc file
> size is still 0 after hours of running. Here is the md.log file I found:
>
>
> Using CUDA 8x8x8 non-bonded kernels
>
> Potential shift: LJ r^-12: 0.112 r^-6 0.335, Ewald 1.000e-05
> Initialized non-bonded Ewald correction tables, spacing: 7.82e-04 size:
> 1536
>
> Removing pbc first time
> Pinning to Hyper-Threading cores with 12 physical cores in a compute node
> There are 1 flexible constraints
>
> WARNING: step size for flexible constraining = 0
>  All flexible constraints will be rigid.
>  Will try to keep all flexible constraints at their original
> length,
>  but the lengths may exhibit some drift.
>
> Initializing Parallel LINear Constraint Solver
> Linking all bonded interactions to atoms
> There are 161872 inter charge-group exclusions,
> will use an extra communication step for exclusion forces for PME
>
> The initial number of communication pulses is: X 1
> The initial domain decomposition cell size is: X 1.83 nm
>
> The maximum allowed distance for charge groups involved in interactions is:
>  non-bonded interactions   1.200 nm
> (the following are initial values, they could change due to box
> deformation)
> two-body bonded interactions  (-rdd)   1.200 nm
>   multi-body bonded interactions  (-rdd)   1.200 nm
>   atoms separated by up to 5 constraints  (-rcon)  1.826 nm
>
> When dynamic load balancing gets turned on, these settings will change to:
> The maximum number of communication pulses is: X 1
> The minimum size for domain decomposition cells is 1.200 nm
> The requested allowed shrink of DD cells (option -dds) is: 0.80
> The allowed shrink of domain decomposition cells is: X 0.66
> The maximum allowed distance for charge groups involved in interactions is:
>  non-bonded interactions   1.200 nm
> two-body bonded interactions  (-rdd)   1.200 nm
>   multi-body bonded interactions  (-rdd)   1.200 nm
>   atoms separated by up to 5 constraints  (-rcon)  1.200 nm
>
> Making 1D domain decomposition grid 4 x 1 x 1, home cell index 0 0 0
>
> Center of mass motion removal mode is Linear
> We have the following groups for center of mass motion removal:
>   0:  Protein_LIG_POPC
>   1:  Water_and_ions
>
>  PLEASE READ AND CITE THE FOLLOWING REFERENCE 
> G. Bussi, D. Donadio and M. Parrinello
> Canonical sampling through velocity rescaling
> J. Chem. Phys. 126 (2007) pp. 014101
>   --- Thank You ---  
>
>
>
> THX
> --
> gmx-users mailing list    gmx-users@gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> * Please don't post (un)subscribe requests to the list. Use the www
> interface or send it to gmx-users-requ...@gromacs.org.
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>


Re: [gmx-users] GPU warnings

2012-12-11 Thread Szilárd Páll
On Tue, Dec 11, 2012 at 6:49 PM, Mirco Wahab <
mirco.wa...@chemie.tu-freiberg.de> wrote:

> Am 11.12.2012 16:04, schrieb Szilárd Páll:
>
>  It looks like some gcc 4.7-s don't work with CUDA, although I've been
>> using
>> various Ubuntu/Linaro versions, most recently 4.7.2 and had no
>> issues whatsoever. Some people seem to have bumped into the same problem
>> (see http://goo.gl/1onBz or http://goo.gl/JEnuk) and the suggested fix is
>> to put
>> #undef _GLIBCXX_ATOMIC_BUILTINS
>> #undef _GLIBCXX_USE_INT128
>> in a header and pre-include it for nvcc by calling it like this:
>> nvcc --pre-include undef_atomics_int128.h
>>
>
> The same problem occurs in SuSE 12.2/x64 with its default 4.7.2
> (20120920).
>
> Another possible fix on SuSE 12.2: install the (older) gcc repository
> from 12.1/x64 (with lower priority), install the gcc/g++ 4.6 from there
> as an alternative compiler and select the "active gcc" through the
> "update-alternatives --config gcc" mechanism. This works very well.
>

Thanks for the info. The Ubuntu/Linaro version must have a fix for
this. Unfortunately, we can't do much about it and gcc 4.7 is anyway
blocked by the CUDA 5.0 headers.

FYI: Verlet scheme nonbonded kernels (and probably the group scheme as
well), especially with AVX, can be quite a bit slower with older gcc
versions.

I find it really annoying (and stupid) that NVIDIA did not fix their
compiler to work with gcc 4.7, which had already been out for almost half
a year at the time of the CUDA 5.0 release.

--
Szilárd


>
> Regards
>
> M.
>
>
>


Re: [gmx-users] GPU warnings

2012-12-11 Thread Mirco Wahab

Am 11.12.2012 16:04, schrieb Szilárd Páll:

It looks like some gcc 4.7-s don't work with CUDA, although I've been using
various Ubuntu/Linaro versions, most recently 4.7.2 and had no
issues whatsoever. Some people seem to have bumped into the same problem
(see http://goo.gl/1onBz or http://goo.gl/JEnuk) and the suggested fix is
to put
#undef _GLIBCXX_ATOMIC_BUILTINS
#undef _GLIBCXX_USE_INT128
in a header and pre-include it for nvcc by calling it like this:
nvcc --pre-include undef_atomics_int128.h


The same problem occurs in SuSE 12.2/x64 with its default 4.7.2
(20120920).

Another possible fix on SuSE 12.2: install the (older) gcc repository
from 12.1/x64 (with lower priority), install the gcc/g++ 4.6 from there
as an alternative compiler and select the "active gcc" through the
"update-alternatives --config gcc" mechanism. This works very well.

Regards

M.



Re: [gmx-users] GPU warnings

2012-12-11 Thread Szilárd Páll
Hi Thomas,

It looks like some gcc 4.7-s don't work with CUDA, although I've been using
various Ubuntu/Linaro versions, most recently 4.7.2 and had no
issues whatsoever. Some people seem to have bumped into the same problem
(see http://goo.gl/1onBz or http://goo.gl/JEnuk) and the suggested fix is
to put
#undef _GLIBCXX_ATOMIC_BUILTINS
#undef _GLIBCXX_USE_INT128
in a header and pre-include it for nvcc by calling it like this:
nvcc --pre-include undef_atomics_int128.h
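For completeness, the workaround as a copy-pasteable sketch. The header filename is arbitrary; the note about wiring the flag into CUDA_NVCC_FLAGS for a CMake build is my assumption, not something stated in the thread, so verify it against your build system.

```shell
# Write the two #undef lines from the fix above into a header
# (any filename works; undef_atomics_int128.h matches the thread):
cat > undef_atomics_int128.h <<'EOF'
/* Work around the gcc 4.7 / CUDA header clash by disabling the
   libstdc++ atomic builtins and __int128 code paths that nvcc
   fails to parse. */
#undef _GLIBCXX_ATOMIC_BUILTINS
#undef _GLIBCXX_USE_INT128
EOF

# Then make nvcc pre-include it for every compilation unit, e.g.:
#   nvcc --pre-include undef_atomics_int128.h -c mykernel.cu
# (In a CMake build, appending "--pre-include;<full path to header>"
#  to CUDA_NVCC_FLAGS should have the same effect -- untested here.)
grep -c '^#undef' undef_atomics_int128.h
```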

Cheers,

--
Szilárd



On Sun, Dec 9, 2012 at 12:18 PM, Thomas Evangelidis wrote:

> > > gcc 4.7.2 is not supported by any CUDA version.
> > >
> >
> > I suggest that you just fix it by editing the include/host_config.h and
> > changing the version check macro (line 82 AFAIK). I've never had real
> > problems with using new and officially not supported gcc-s, the version
> > check is more of a promise from NVIDIA that "we've tested thoroughly
> > internally and we more or less vouch for this combination".
> >
> > Cheers,
> > --
> > Szilárd
> >
> > PS:
> > Disclaimer: I don't take responsibility if your machine goes up in flames!
> > ;)
> >
> >
> Hi Szilárd,,
>
> I tried to compile gromacs-4.6beta1; is this the version you suggested? If
> not, please indicate how to download the source, because I am confused by
> all these development versions.
>
> Anyway, this is the error I get with 4.6beta1, gcc 4.7.2 and cuda 5:
>
> [  0%] Building NVCC (Device) object
>
> src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir//./cuda_tools_generated_cudautils.cu.o
>
> /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../include/c++/4.7.2/ext/atomicity.h(48):
> error: identifier "__atomic_fetch_add" is undefined
>
>
> /usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../include/c++/4.7.2/ext/atomicity.h(52):
> error: identifier "__atomic_fetch_add" is undefined
>
> 2 errors detected in the compilation of
> "/tmp/tmpxft_2394_-9_cudautils.compute_30.cpp1.ii".
> CMake Error at cuda_tools_generated_cudautils.cu.o.cmake:252 (message):
>   Error generating file
>
>
> /home/thomas/Programs/gromacs-4.6-beta1_gnu_cuda5_build/src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir//./cuda_tools_generated_cudautils.cu.o
>
>
> gmake[3]: ***
>
> [src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir/./cuda_tools_generated_cudautils.cu.o]
> Error 1
> gmake[2]: *** [src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir/all]
> Error 2
> gmake[1]: *** [src/programs/mdrun/CMakeFiles/mdrun.dir/rule] Error 2
> gmake: *** [mdrun] Error 2
>
>
> Unless I am missing something, cuda 5 does not support gcc 4.7.2.
>
>
>  Thomas
>


Re: [gmx-users] GPU compatibility

2012-12-10 Thread Mark Abraham
Correct, C1060 does not have the CUDA 2.0 compute capability required for
GROMACS 4.6. We will not have the ability to support GPU cards of lower
capability in the future. Unfortunately, your only GROMACS options are
probably to use the OpenMM functionality in 4.5.x (which is still present
in 4.6, works as far as we know, but is not in our regular test suite and
the feature is probably headed for deprecation). This will not perform as
well as the new native GPU acceleration, and supports a smaller range of
features, but might be better than wasting the GPUs.

Regards,

Mark

On Mon, Dec 10, 2012 at 7:50 AM, Cara Kreck  wrote:

>
>
>
>
> Hi,
>
> We've got a GPU cluster in our group and have really been looking forward
> to running gromacs on it with full functionality. Unfortunately, it looks
> like our NVIDIA Tesla C1060 cards aren't supported by the 4.6 beta. I was
> just wondering if there was any chance that they would be supported in the
> full version? These cards are only a couple of years old now and were
> bought specifically for running MD.
>
> Thanks,
>
> Cara
>
>


Re: [gmx-users] GPU warnings

2012-12-09 Thread Thomas Evangelidis
> > gcc 4.7.2 is not supported by any CUDA version.
> >
>
> I suggest that you just fix it by editing the include/host_config.h and
> changing the version check macro (line 82 AFAIK). I've never had real
> problems with using new and officially not supported gcc-s, the version
> check is more of a promise from NVIDIA that "we've tested thoroughly
> internally and we more or less vouch for this combination".
>
> Cheers,
> --
> Szilárd
>
> PS:
> Disclaimer: I don't take responsibility if your machine goes up in flames!
> ;)
>
>
Hi Szilárd,,

I tried to compile gromacs-4.6beta1; is this the version you suggested? If
not, please indicate how to download the source, because I am confused by
all these development versions.

Anyway, this is the error I get with 4.6beta1, gcc 4.7.2 and cuda 5:

[  0%] Building NVCC (Device) object
src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir//./cuda_tools_generated_cudautils.cu.o
/usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../include/c++/4.7.2/ext/atomicity.h(48):
error: identifier "__atomic_fetch_add" is undefined

/usr/lib/gcc/x86_64-redhat-linux/4.7.2/../../../../include/c++/4.7.2/ext/atomicity.h(52):
error: identifier "__atomic_fetch_add" is undefined

2 errors detected in the compilation of
"/tmp/tmpxft_2394_-9_cudautils.compute_30.cpp1.ii".
CMake Error at cuda_tools_generated_cudautils.cu.o.cmake:252 (message):
  Error generating file

/home/thomas/Programs/gromacs-4.6-beta1_gnu_cuda5_build/src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir//./cuda_tools_generated_cudautils.cu.o


gmake[3]: ***
[src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir/./cuda_tools_generated_cudautils.cu.o]
Error 1
gmake[2]: *** [src/gromacs/gmxlib/cuda_tools/CMakeFiles/cuda_tools.dir/all]
Error 2
gmake[1]: *** [src/programs/mdrun/CMakeFiles/mdrun.dir/rule] Error 2
gmake: *** [mdrun] Error 2


Unless I am missing something, cuda 5 does not support gcc 4.7.2.


 Thomas


Re: [gmx-users] GPU warnings

2012-11-26 Thread Szilárd Páll
On Sun, Nov 25, 2012 at 8:47 PM, Thomas Evangelidis wrote:

> Hi Szilárd,
>
> I was able to run code compiled with icc 13 on Fedora 17, but as I don't
> > have Intel Compiler v13 on this machine I can't check it now.
> >
> > Please check if it works for you with gcc 4.7.2 (which is the default)
> and
> > let me know if you succeed. The performance difference between icc and
> gcc
> > on your processor should be negligible with GPU runs and at most 5-10%
> with
> > CPU-only runs.
> >
> > As the issue is quite annoying, I'll try to have a look later, probably
> > after the beta is out.
> >
> >
> gcc 4.7.2 is not supported by any CUDA version.
>

I suggest that you just fix it by editing the include/host_config.h and
changing the version check macro (line 82 AFAIK). I've never had real
problems with using new and officially not supported gcc-s, the version
check is more of a promise from NVIDIA that "we've tested thoroughly
internally and we more or less vouch for this combination".
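A sketch of that edit, demonstrated on a stand-in excerpt rather than the real file. The real check lives in include/host_config.h under your CUDA install; the exact wording of the version check and #error line below is reconstructed from memory of CUDA 5.0 headers and may differ in your copy, so inspect the file before editing it.

```shell
# Stand-in for the gcc version check in CUDA's host_config.h
# (assumed wording -- check your actual header):
cat > host_config_excerpt.h <<'EOF'
#if __GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ > 6)
#error -- unsupported GNU version! gcc 4.7 and up are not supported!
#endif
EOF

# Relax the minor-version bound so gcc 4.7 passes the check
# (officially untested by NVIDIA -- you are on your own):
sed -i 's/__GNUC_MINOR__ > 6/__GNUC_MINOR__ > 7/' host_config_excerpt.h
grep '__GNUC_MINOR__' host_config_excerpt.h
```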

Cheers,
--
Szilárd

PS:
Disclaimer: I don't take responsibility if your machine goes up in flames! ;)


>
> Thomas
>


Re: [gmx-users] GPU warnings

2012-11-25 Thread Thomas Evangelidis
Hi Szilárd,

I was able to run code compiled with icc 13 on Fedora 17, but as I don't
> have Intel Compiler v13 on this machine I can't check it now.
>
> Please check if it works for you with gcc 4.7.2 (which is the default) and
> let me know if you succeed. The performance difference between icc and gcc
> on your processor should be negligible with GPU runs and at most 5-10% with
> CPU-only runs.
>
> As the issue is quite annoying, I'll try to have a look later, probably
> after the beta is out.
>
>
gcc 4.7.2 is not supported by any CUDA version.

Thomas
--
gmx-users mailing listgmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
* Please don't post (un)subscribe requests to the list. Use the
www interface or send it to gmx-users-requ...@gromacs.org.
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


Re: [gmx-users] GPU warnings

2012-11-21 Thread Szilárd Páll
On Mon, Nov 19, 2012 at 6:25 PM, Szilárd Páll wrote:

> On Mon, Nov 19, 2012 at 4:09 PM, Thomas Evangelidis wrote:
>
>> Hi Szilárd,
>>
>> I compiled with the Intel compilers, not gcc. In case I am missing
>> something, these are the versions I have:
>>
>
> Indeed, I see it now in the log file. Let me try with icc 13 and will get
> back to you.
>

I was able to run code compiled with icc 13 on Fedora 17, but as I don't
have Intel Compiler v13 on this machine I can't check it now.

Please check if it works for you with gcc 4.7.2 (which is the default) and
let me know if you succeed. The performance difference between icc and gcc
on your processor should be negligible with GPU runs and at most 5-10% with
CPU-only runs.

As the issue is quite annoying, I'll try to have a look later, probably
after the beta is out.

Cheers,
Sz.


>
>> glibc.i6862.15-57.fc17
>> @updates
>> glibc.x86_64  2.15-57.fc17
>> @updates
>> glibc-common.x86_64   2.15-57.fc17
>> @updates
>> glibc-devel.i686  2.15-57.fc17
>> @updates
>> glibc-devel.x86_642.15-57.fc17
>> @updates
>> glibc-headers.x86_64  2.15-57.fc17   @updates
>>
>> gcc.x86_644.7.2-2.fc17
>> @updates
>> gcc-c++.x86_644.7.2-2.fc17
>> @updates
>> gcc-gfortran.x86_64   4.7.2-2.fc17
>> @updates
>> libgcc.i686   4.7.2-2.fc17
>> @updates
>> libgcc.x86_64 4.7.2-2.fc17   @updates
>>
>>
>> Thomas
>>
>>
>>
>> On 19 November 2012 16:57, Szilárd Páll  wrote:
>>
>> > Thomas & Albert,
>> >
>> > We are unable to reproduce the issue on FC 17 with glibc 2.15-58 and gcc
>> > 4.7.2.
>> >
>> > Please try to update your packages (you should have updates available
>> for
>> > glibc), try recompiling with the latest 4.6 code and report back whether
>> > you succeed.
>> >
>> > Cheers,
>> >
>> > --
>> > Szilárd
>> >
>> >
>> > On Fri, Nov 16, 2012 at 4:31 PM, Szilárd Páll > > >wrote:
>> >
>> > > Hi Albert,
>> > >
>> > > Apologies for hijacking your thread. Do you happen to have Fedora 17
>> as
>> > > well?
>> > >
>> > > --
>> > > Szilárd
>> > >
>> > >
>> > >
>> > > On Sun, Nov 4, 2012 at 10:55 AM, Albert  wrote:
>> > >
>> > >> hello:
>> > >>
>> > >>  I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti
>> (2 x
>> > >> 1344 CUDA cores), and I got the following warnings:
>> > >>
>> > >> thank you very much.
>> > >>
>> > >> ---messages---
>> > >>
>> > >> WARNING: On node 0: oversubscribing the available 0 logical CPU cores
>> > per
>> > >> node with 2 MPI processes.
>> > >>  This will cause considerable performance loss!
>> > >>
>> > >> 2 GPUs detected on host boreas:
>> > >>   #0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
>> > >> compatible
>> > >>   #1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
>> > >> compatible
>> > >>
>> > >> 2 GPUs auto-selected to be used for this run: #0, #1
>> > >>
>> > >> Using CUDA 8x8x8 non-bonded kernels
>> > >> Making 1D domain decomposition 1 x 2 x 1
>> > >>
>> > >> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
>> > >> We have just committed the new CPU detection code in this branch,
>> > >> and will commit new SSE/AVX kernels in a few days. However, this
>> > >> means that currently only the NxN kernels are accelerated!
>> > >> In the mean time, you might want to avoid production runs in 4.6.
>> > >>
>> > >>
>> > >
>> > >
>> >
>>
>>
>>
>> --
>>
>> ==
>>
>> Thomas Evangelidis
>>
>> PhD student
>> University of Athens
>> Faculty of Pharmacy
>> Department of Pharmaceutical Chemistry
>> Panepistimioupoli-Zografou
>> 157 71 Athens
>> GREECE
>>
>> email: tev...@pharm.uo

Re: [gmx-users] GPU warnings

2012-11-19 Thread Szilárd Páll
On Mon, Nov 19, 2012 at 4:09 PM, Thomas Evangelidis wrote:

> Hi Szilárd,
>
> I compiled with the Intel compilers, not gcc. In case I am missing
> something, these are the versions I have:
>

Indeed, I see it now in the log file. Let me try with icc 13 and will get
back to you.

>
> glibc.i6862.15-57.fc17
> @updates
> glibc.x86_64  2.15-57.fc17
> @updates
> glibc-common.x86_64   2.15-57.fc17
> @updates
> glibc-devel.i686  2.15-57.fc17
> @updates
> glibc-devel.x86_642.15-57.fc17
> @updates
> glibc-headers.x86_64  2.15-57.fc17   @updates
>
> gcc.x86_644.7.2-2.fc17
> @updates
> gcc-c++.x86_644.7.2-2.fc17
> @updates
> gcc-gfortran.x86_64   4.7.2-2.fc17
> @updates
> libgcc.i686   4.7.2-2.fc17
> @updates
> libgcc.x86_64 4.7.2-2.fc17   @updates
>
>
> Thomas
>
>
>
> On 19 November 2012 16:57, Szilárd Páll  wrote:
>
> > Thomas & Albert,
> >
> > We are unable to reproduce the issue on FC 17 with glibc 2.15-58 and gcc
> > 4.7.2.
> >
> > Please try to update your packages (you should have updates available for
> > glibc), try recompiling with the latest 4.6 code and report back whether
> > you succeed.
> >
> > Cheers,
> >
> > --
> > Szilárd
> >
> >
> > On Fri, Nov 16, 2012 at 4:31 PM, Szilárd Páll  > >wrote:
> >
> > > Hi Albert,
> > >
> > > Apologies for hijacking your thread. Do you happen to have Fedora 17 as
> > > well?
> > >
> > > --
> > > Szilárd
> > >
> > >
> > >
> > > On Sun, Nov 4, 2012 at 10:55 AM, Albert  wrote:
> > >
> > >> hello:
> > >>
> > >>  I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti (2
> x
> > >> 1344 CUDA cores), and I got the following warnings:
> > >>
> > >> thank you very much.
> > >>
> > >> ---messages---
> > >>
> > >> WARNING: On node 0: oversubscribing the available 0 logical CPU cores
> > per
> > >> node with 2 MPI processes.
> > >>  This will cause considerable performance loss!
> > >>
> > >> 2 GPUs detected on host boreas:
> > >>   #0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> > >> compatible
> > >>   #1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> > >> compatible
> > >>
> > >> 2 GPUs auto-selected to be used for this run: #0, #1
> > >>
> > >> Using CUDA 8x8x8 non-bonded kernels
> > >> Making 1D domain decomposition 1 x 2 x 1
> > >>
> > >> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> > >> We have just committed the new CPU detection code in this branch,
> > >> and will commit new SSE/AVX kernels in a few days. However, this
> > >> means that currently only the NxN kernels are accelerated!
> > >> In the mean time, you might want to avoid production runs in 4.6.
> > >>
> > >> --
> > >> gmx-users mailing listgmx-users@gromacs.org
> > >> http://lists.gromacs.org/**mailman/listinfo/gmx-users<
> > http://lists.gromacs.org/mailman/listinfo/gmx-users>
> > >> * Please search the archive at http://www.gromacs.org/**
> > >> Support/Mailing_Lists/Search<
> > http://www.gromacs.org/Support/Mailing_Lists/Search>before posting!
> > >> * Please don't post (un)subscribe requests to the list. Use the www
> > >> interface or send it to gmx-users-requ...@gromacs.org.
> > >> * Can't post? Read http://www.gromacs.org/**Support/Mailing_Lists<
> > http://www.gromacs.org/Support/Mailing_Lists>
> > >>
> > >
> > >
> > --
> > gmx-users mailing listgmx-users@gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > * Please don't post (un)subscribe requests to the list. Use the
> > www interface or send it to gmx-users-requ...@gromacs.org.
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >
>
>
>
> --
>
> ==
>
> Thomas Evangelidis
>
> PhD student
> University of Athens
> Faculty of Pharmacy
> Department of Pharmaceutical Chemistry
> Panepistimioupoli-Zografou
> 157 71 Athens
> GREECE
>
> email: tev...@pharm.uoa.gr
>
>   teva...@gmail.com
>
>
> website: https://sites.google.com/site/thomasevangelidishomepage/
> --
> gmx-users mailing listgmx-users@gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> * Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-users-requ...@gromacs.org.
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
--
gmx-users mailing listgmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/

Re: [gmx-users] GPU warnings

2012-11-19 Thread Thomas Evangelidis
Hi Szilárd,

I compiled with the Intel compilers, not gcc. In case I am missing
something, these are the versions I have:

glibc.i686            2.15-57.fc17   @updates
glibc.x86_64          2.15-57.fc17   @updates
glibc-common.x86_64   2.15-57.fc17   @updates
glibc-devel.i686      2.15-57.fc17   @updates
glibc-devel.x86_64    2.15-57.fc17   @updates
glibc-headers.x86_64  2.15-57.fc17   @updates

gcc.x86_64            4.7.2-2.fc17   @updates
gcc-c++.x86_64        4.7.2-2.fc17   @updates
gcc-gfortran.x86_64   4.7.2-2.fc17   @updates
libgcc.i686           4.7.2-2.fc17   @updates
libgcc.x86_64         4.7.2-2.fc17   @updates


Thomas



On 19 November 2012 16:57, Szilárd Páll  wrote:

> Thomas & Albert,
>
> We are unable to reproduce the issue on FC 17 with glibc 2.15-58 and gcc
> 4.7.2.
>
> Please try to update your packages (you should have updates available for
> glibc), try recompiling with the latest 4.6 code and report back whether
> you succeed.
>
> Cheers,
>
> --
> Szilárd
>
>
> On Fri, Nov 16, 2012 at 4:31 PM, Szilárd Páll  >wrote:
>
> > Hi Albert,
> >
> > Apologies for hijacking your thread. Do you happen to have Fedora 17 as
> > well?
> >
> > --
> > Szilárd
> >
> >
> >
> > On Sun, Nov 4, 2012 at 10:55 AM, Albert  wrote:
> >
> >> hello:
> >>
> >>  I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti (2 x
> >> 1344 CUDA cores), and I got the following warnings:
> >>
> >> thank you very much.
> >>
> >> ---**messages--**
> >> -
> >>
> >> WARNING: On node 0: oversubscribing the available 0 logical CPU cores
> per
> >> node with 2 MPI processes.
> >>  This will cause considerable performance loss!
> >>
> >> 2 GPUs detected on host boreas:
> >>   #0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> >> compatible
> >>   #1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> >> compatible
> >>
> >> 2 GPUs auto-selected to be used for this run: #0, #1
> >>
> >> Using CUDA 8x8x8 non-bonded kernels
> >> Making 1D domain decomposition 1 x 2 x 1
> >>
> >> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> >> We have just committed the new CPU detection code in this branch,
> >> and will commit new SSE/AVX kernels in a few days. However, this
> >> means that currently only the NxN kernels are accelerated!
> >> In the mean time, you might want to avoid production runs in 4.6.
> >>
> >> --
> >> gmx-users mailing listgmx-users@gromacs.org
> >> http://lists.gromacs.org/**mailman/listinfo/gmx-users<
> http://lists.gromacs.org/mailman/listinfo/gmx-users>
> >> * Please search the archive at http://www.gromacs.org/**
> >> Support/Mailing_Lists/Search<
> http://www.gromacs.org/Support/Mailing_Lists/Search>before posting!
> >> * Please don't post (un)subscribe requests to the list. Use the www
> >> interface or send it to gmx-users-requ...@gromacs.org.
> >> * Can't post? Read http://www.gromacs.org/**Support/Mailing_Lists<
> http://www.gromacs.org/Support/Mailing_Lists>
> >>
> >
> >
> --
> gmx-users mailing listgmx-users@gromacs.org
> http://lists.gromacs.org/mailman/listinfo/gmx-users
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> * Please don't post (un)subscribe requests to the list. Use the
> www interface or send it to gmx-users-requ...@gromacs.org.
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>



-- 

==

Thomas Evangelidis

PhD student
University of Athens
Faculty of Pharmacy
Department of Pharmaceutical Chemistry
Panepistimioupoli-Zografou
157 71 Athens
GREECE

email: tev...@pharm.uoa.gr

  teva...@gmail.com


website: https://sites.google.com/site/thomasevangelidishomepage/
--
gmx-users mailing list gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
* Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
* Please don't post (un)subscribe requests to the list. Use the
www interface or send it to gmx-users-requ...@gromacs.org.
* Can't post? Read http://www.gromacs.org/Support/Mailing_Lists


Re: [gmx-users] GPU warnings

2012-11-19 Thread Szilárd Páll
Thomas & Albert,

We are unable to reproduce the issue on FC 17 with glibc 2.15-58 and gcc
4.7.2.

Please try to update your packages (you should have updates available for
glibc), try recompiling with the latest 4.6 code and report back whether
you succeed.

Cheers,

--
Szilárd




Re: [gmx-users] GPU warnings

2012-11-16 Thread Szilárd Páll
Hi Albert,

Apologies for hijacking your thread. Do you happen to have Fedora 17 as
well?

--
Szilárd


On Sun, Nov 4, 2012 at 10:55 AM, Albert  wrote:

> hello:
>
>  I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti (2 x
> 1344 CUDA cores), and I got the following warnings:
>
> thank you very much.
>
> ---messages---
>
> WARNING: On node 0: oversubscribing the available 0 logical CPU cores per
> node with 2 MPI processes.
>  This will cause considerable performance loss!
>
> 2 GPUs detected on host boreas:
>   #0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> compatible
>   #1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> compatible
>
> 2 GPUs auto-selected to be used for this run: #0, #1
>
> Using CUDA 8x8x8 non-bonded kernels
> Making 1D domain decomposition 1 x 2 x 1
>
> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> We have just committed the new CPU detection code in this branch,
> and will commit new SSE/AVX kernels in a few days. However, this
> means that currently only the NxN kernels are accelerated!
> In the mean time, you might want to avoid production runs in 4.6.
>


Re: [gmx-users] GPU warnings

2012-11-16 Thread Szilárd Páll
Hi Thomas,

The output you get means that none of the macros we try to use are defined,
although your man pages seem to refer to them. Hence, I'm really clueless as
to why this is happening. Could you please file a bug report on
redmine.gromacs.org and attach the initial output, my patch, and the
resulting output? Don't forget to specify the versions of the software you
were using.

Thanks,
--
Szilárd

On Thu, Nov 15, 2012 at 3:53 PM, Thomas Evangelidis wrote:

> Hi Szilárd,
>
> This is the warning message I get this time:
>
> WARNING: Oversubscribing the available -66 logical CPU cores with 1
> thread-MPI threads.
>
>  This will cause considerable performance loss!
>
> I have also attached the md.log file.
>
> thanks,
> Thomas
>
>
>
Re: [gmx-users] GPU warnings

2012-11-15 Thread Justin Lemkul



On 11/15/12 9:53 AM, Thomas Evangelidis wrote:

Hi Szilárd,

This is the warning message I get this time:

WARNING: Oversubscribing the available -66 logical CPU cores with 1
thread-MPI threads.
  This will cause considerable performance loss!

I have also attached the md.log file.



Attachments are rejected by the mailing list.  They either have to be copied and 
pasted, linked, or sent to an individual specifically off-list.


-Justin

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




Re: [gmx-users] GPU warnings

2012-11-14 Thread Szilárd Páll
Hi Thomas,

Could you please try applying the attached patch (git apply
hardware_detect.patch in the 4.6 source root) and let me know what the
output is?

This should show which sysconf macro is used and what its return value is
as well as indicate if none of the macros are in fact defined by your
headers.

Thanks,

--
Szilárd



Re: [gmx-users] GPU warnings

2012-11-10 Thread Thomas Evangelidis
On 10 November 2012 03:21, Szilárd Páll  wrote:

> Hi,
>
> You must have an odd sysconf version! Could you please check what is the
> sysconf system variable's name in the sysconf man page (man sysconf) where
> it says something like:
>
> _SC_NPROCESSORS_ONLN
>  The number of processors currently online.
>
> The first line should be one of the
> following: _SC_NPROCESSORS_ONLN, _SC_NPROC_ONLN,
> _SC_NPROCESSORS_CONF, _SC_NPROC_CONF, but I guess yours is something
> different.
>

The following text is taken from man sysconf:

   These values also exist, but may not be standard.

- _SC_PHYS_PAGES
  The number of pages of physical memory.  Note that it is
possible for the product of this value and the value of _SC_PAGE_SIZE to
overflow.

- _SC_AVPHYS_PAGES
  The number of currently available pages of physical memory.

- _SC_NPROCESSORS_CONF
  The number of processors configured.

- _SC_NPROCESSORS_ONLN
  The number of processors currently online (available).




> Can you also check what your glibc version is?
>

$ yum list installed | grep glibc
glibc.i686            2.15-57.fc17   @updates
glibc.x86_64          2.15-57.fc17   @updates
glibc-common.x86_64   2.15-57.fc17   @updates
glibc-devel.i686      2.15-57.fc17   @updates
glibc-devel.x86_64    2.15-57.fc17   @updates
glibc-headers.x86_64  2.15-57.fc17   @updates



>
>


Re: [gmx-users] GPU warnings

2012-11-09 Thread Szilárd Páll
Hi,

You must have an odd sysconf version! Could you please check what is the
sysconf system variable's name in the sysconf man page (man sysconf) where
it says something like:

_SC_NPROCESSORS_ONLN
 The number of processors currently online.

The first line should be one of the
following: _SC_NPROCESSORS_ONLN, _SC_NPROC_ONLN,
_SC_NPROCESSORS_CONF, _SC_NPROC_CONF, but I guess yours is something
different.

Can you also check what your glibc version is?

Thanks,

--
Szilárd


On Fri, Nov 9, 2012 at 5:51 PM, Thomas Evangelidis wrote:

>
>
>
> > I get these two warnings when I run the dhfr/GPU/dhfr-solv-PME.bench
>> > benchmark with the following command line:
>> >
>> > mdrun_intel_cuda5 -v -s topol.tpr -testverlet
>> >
>> > "WARNING: Oversubscribing the available 0 logical CPU cores with 1
>> > thread-MPI threads."
>> >
>> > 0 logical CPU cores? Isn't this bizarre? My CPU is Intel Core i7-3610QM
>> >
>>
>> That is bizarre. Could you run with "-debug 1" and have a look at the
>> mdrun.debug output which should contain a message like:
>> "Detected N processors, will use this as the number of supported hardware
>> threads."
>>
>> I'm wondering, is N=0 in your case!?
>>
>> It says "Detected 0 processors, will use this as the number of supported
> hardware threads."
>
>
>>
>> > (2.3 GHz). Unlike Albert, I don't see any performance loss, I get 13.4
>> > ns/day on a single core with 1 GPU and 13.2 ns/day with GROMACS v4.5.5
>> on 4
>> > cores (8 threads) without the GPU. Yet, I don't see any performance gain
>> > with more than 4 -nt threads.
>> >
>> > mdrun_intel_cuda5 -v -nt 2 -s topol.tpr -testverlet : 15.4 ns/day
>> > mdrun_intel_cuda5 -v -nt 3 -s topol.tpr -testverlet : 16.0 ns/day
>> > mdrun_intel_cuda5 -v -nt 4 -s topol.tpr -testverlet : 16.3 ns/day
>> > mdrun_intel_cuda5 -v -nt 6 -s topol.tpr -testverlet : 16.2 ns/day
>> > mdrun_intel_cuda5 -v -nt 8 -s topol.tpr -testverlet : 15.4 ns/day
>> >
>>
>> I guess there is not much point in not using all cores, is it? Note that
>> the performance drops after 4 threads because Hyper-Threading with OpenMP
>> doesn't always help.
>>
>>
>> >
>> > I have also attached my log file (from "mdrun_intel_cuda5 -v -s
>> topol.tpr
>> > -testverlet") in case you find it helpful.
>> >
>>
>> I don't see it attached.
>>
>>
>>
> I have attached both mdrun_intel_cuda5.debug and md.log files.  They will
> possibly be filtered by the mailing list but will be delivered to your
> email.
>
> thanks,
> Thomas
>


Re: [gmx-users] GPU warnings

2012-11-09 Thread Szilárd Páll
Hi,

On Tue, Nov 6, 2012 at 12:03 AM, Thomas Evangelidis wrote:

> Hi,
>
> I get these two warnings when I run the dhfr/GPU/dhfr-solv-PME.bench
> benchmark with the following command line:
>
> mdrun_intel_cuda5 -v -s topol.tpr -testverlet
>
> "WARNING: Oversubscribing the available 0 logical CPU cores with 1
> thread-MPI threads."
>
> 0 logical CPU cores? Isn't this bizarre? My CPU is Intel Core i7-3610QM
>

That is bizarre. Could you run with "-debug 1" and have a look at the
mdrun.debug output which should contain a message like:
"Detected N processors, will use this as the number of supported hardware
threads."

I'm wondering, is N=0 in your case!?


> (2.3 GHz). Unlike Albert, I don't see any performance loss, I get 13.4
> ns/day on a single core with 1 GPU and 13.2 ns/day with GROMACS v4.5.5 on 4
> cores (8 threads) without the GPU. Yet, I don't see any performance gain
> with more than 4 -nt threads.
>
> mdrun_intel_cuda5 -v -nt 2 -s topol.tpr -testverlet : 15.4 ns/day
> mdrun_intel_cuda5 -v -nt 3 -s topol.tpr -testverlet : 16.0 ns/day
> mdrun_intel_cuda5 -v -nt 4 -s topol.tpr -testverlet : 16.3 ns/day
> mdrun_intel_cuda5 -v -nt 6 -s topol.tpr -testverlet : 16.2 ns/day
> mdrun_intel_cuda5 -v -nt 8 -s topol.tpr -testverlet : 15.4 ns/day
>

I guess there is not much point in not using all cores, is it? Note that
the performance drops after 4 threads because Hyper-Threading with OpenMP
doesn't always help.


>
> I have also attached my log file (from "mdrun_intel_cuda5 -v -s topol.tpr
> -testverlet") in case you find it helpful.
>

I don't see it attached.

--
Szilárd


>
> Thanks,
> Thomas
>
>
>
> On 5 November 2012 18:54, Szilárd Páll  wrote:
>
> > The first warning indicates that you are starting more threads than the
> > hardware supports which would explain the poor performance.
> >
> > Could you share a log file of the suspiciously slow run as well as the
> command
> > line you used to start mdrun?
> >
> > Cheers,
> >
> > --
> > Szilárd
> >
> >
> > On Sun, Nov 4, 2012 at 5:32 PM, Albert  wrote:
> >
> > > well, IC.
> > > the performance is rather poor compared to the GTX 590: 32 ns/day vs 4 ns/day.
> > > probably that's also something related to the warnings?
> > >
> > > THX
> > >
> > >
> > >
> > > On 11/04/2012 01:59 PM, Justin Lemkul wrote:
> > >
> > >>
> > >>
> > >> On 11/4/12 4:55 AM, Albert wrote:
> > >>
> > >>> hello:
> > >>>
> > >>>   I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti
> (2
> > x
> > >>> 1344
> > >>> CUDA cores), and I got the following warnings:
> > >>>
> > >>> thank you very much.
> > >>>
> > >>> ---messages---
> > >>>
> > >>> WARNING: On node 0: oversubscribing the available 0 logical CPU cores
> > >>> per node
> > >>> with 2 MPI processes.
> > >>>   This will cause considerable performance loss!
> > >>>
> > >>> 2 GPUs detected on host boreas:
> > >>>#0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> > >>> compatible
> > >>>#1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> > >>> compatible
> > >>>
> > >>> 2 GPUs auto-selected to be used for this run: #0, #1
> > >>>
> > >>> Using CUDA 8x8x8 non-bonded kernels
> > >>> Making 1D domain decomposition 1 x 2 x 1
> > >>>
> > >>> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> > >>> We have just committed the new CPU detection code in this branch,
> > >>> and will commit new SSE/AVX kernels in a few days. However, this
> > >>> means that currently only the NxN kernels are accelerated!
> > >>> In the mean time, you might want to avoid production runs in 4.6.
> > >>>
> > >>>
> > >> I can't address the first warning, but the second is fairly obvious.
> > >>  You're not using an official release, you're using the development
> > version
> > >> - let the user beware.  The code is not yet production-ready.
> > >>
> > >> -Justin
> > >>
> > >>

Re: [gmx-users] GPU warnings

2012-11-05 Thread Thomas Evangelidis
Hi,

I get these two warnings when I run the dhfr/GPU/dhfr-solv-PME.bench
benchmark with the following command line:

mdrun_intel_cuda5 -v -s topol.tpr -testverlet

"WARNING: Oversubscribing the available 0 logical CPU cores with 1
thread-MPI threads."

0 logical CPU cores? Isn't this bizarre? My CPU is Intel Core i7-3610QM
(2.3 GHz). Unlike Albert, I don't see any performance loss, I get 13.4
ns/day on a single core with 1 GPU and 13.2 ns/day with GROMACS v4.5.5 on 4
cores (8 threads) without the GPU. Yet, I don't see any performance gain
with more than 4 -nt threads.

mdrun_intel_cuda5 -v -nt 2 -s topol.tpr -testverlet : 15.4 ns/day
mdrun_intel_cuda5 -v -nt 3 -s topol.tpr -testverlet : 16.0 ns/day
mdrun_intel_cuda5 -v -nt 4 -s topol.tpr -testverlet : 16.3 ns/day
mdrun_intel_cuda5 -v -nt 6 -s topol.tpr -testverlet : 16.2 ns/day
mdrun_intel_cuda5 -v -nt 8 -s topol.tpr -testverlet : 15.4 ns/day
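(Editorial aside: the plateau in these numbers — throughput peaking around 4 real cores and falling again at 8 threads, once Hyper-Threading is oversubscribed — can be tabulated with a short script. A minimal sketch using only the ns/day figures reported above; nothing here queries mdrun itself:

```python
# Sketch: pick the best -nt setting from the benchmark numbers reported
# in this message (threads -> ns/day). Values are copied from the email.
benchmarks = {1: 13.4, 2: 15.4, 3: 16.0, 4: 16.3, 6: 16.2, 8: 15.4}

def best_thread_count(results):
    """Return the (threads, ns_per_day) pair with the highest throughput."""
    return max(results.items(), key=lambda kv: kv[1])

nt, ns_day = best_thread_count(benchmarks)
print(f"best: -nt {nt} at {ns_day} ns/day")  # best: -nt 4 at 16.3 ns/day
```

The drop beyond 4 threads is consistent with a 4-core/8-thread CPU: the extra logical threads share execution units with the first four.)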

I have also attached my log file (from "mdrun_intel_cuda5 -v -s topol.tpr
-testverlet") in case you find it helpful.

Thanks,
Thomas



On 5 November 2012 18:54, Szilárd Páll  wrote:

> The first warning indicates that you are starting more threads than the
> hardware supports which would explain the poor performance.
>
> Could you share a log file of the suspiciously slow run, as well as the command
> line you used to start mdrun?
>
> Cheers,
>
> --
> Szilárd
>
>
> On Sun, Nov 4, 2012 at 5:32 PM, Albert  wrote:
>
> > well, IC.
> > the performance is rather poor compared to the GTX 590: 32 ns/day vs 4 ns/day.
> > Is that perhaps also related to the warnings?
> >
> > THX
> >
> >
> >
> > On 11/04/2012 01:59 PM, Justin Lemkul wrote:
> >
> >>
> >>
> >> On 11/4/12 4:55 AM, Albert wrote:
> >>
> >>> hello:
> >>>
> >>>   I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti (2
> x
> >>> 1344
> >>> CUDA cores), and I got the following warnings:
> >>>
> >>> thank you very much.
> >>>
> >>> ---messages---
> >>>
> >>> WARNING: On node 0: oversubscribing the available 0 logical CPU cores
> >>> per node
> >>> with 2 MPI processes.
> >>>   This will cause considerable performance loss!
> >>>
> >>> 2 GPUs detected on host boreas:
> >>>#0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> >>> compatible
> >>>#1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
> >>> compatible
> >>>
> >>> 2 GPUs auto-selected to be used for this run: #0, #1
> >>>
> >>> Using CUDA 8x8x8 non-bonded kernels
> >>> Making 1D domain decomposition 1 x 2 x 1
> >>>
> >>> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
> >>> We have just committed the new CPU detection code in this branch,
> >>> and will commit new SSE/AVX kernels in a few days. However, this
> >>> means that currently only the NxN kernels are accelerated!
> >>> In the mean time, you might want to avoid production runs in 4.6.
> >>>
> >>>
> >> I can't address the first warning, but the second is fairly obvious.
> >>  You're not using an official release, you're using the development
> version
> >> - let the user beware.  The code is not yet production-ready.
> >>
> >> -Justin
> >>
> >>
> > --
> > gmx-users mailing list    gmx-users@gromacs.org
> > http://lists.gromacs.org/mailman/listinfo/gmx-users
> > * Please search the archive at
> > http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
> > * Please don't post (un)subscribe requests to the list. Use the www
> > interface or send it to gmx-users-requ...@gromacs.org.
> > * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
> >



-- 

==

Thomas Evangelidis

PhD student
University of Athens
Faculty of Pharmacy
Department of Pharmaceutical Chemistry
Panepistimioupoli-Zografou
157 71 Athens
GREECE

email: tev...@pharm.uoa.gr

  teva...@gmail.com


website: https://sites.google.com/site/thomasevangelidishomepage/

Re: [gmx-users] GPU warnings

2012-11-05 Thread Szilárd Páll
The first warning indicates that you are starting more threads than the
hardware supports which would explain the poor performance.
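(Editorial aside: one way to avoid oversubscription is to cap the requested thread count at the logical cores the OS reports. A minimal sketch with a hypothetical helper — this is not part of GROMACS; `os.cpu_count()` is what a correctly working build would detect, and the "0 logical CPU cores" in the warning suggests that detection failed:

```python
import os

def clamp_thread_count(requested: int) -> int:
    """Clamp a requested mdrun -nt value to the logical cores available."""
    logical = os.cpu_count() or 1  # fall back to 1 if detection fails
    return max(1, min(requested, logical))

# e.g. on a 4-core/8-thread i7, requesting 16 threads would be clamped to 8
print(clamp_thread_count(16))
```

Passing the clamped value via `-nt` avoids starting more threads than the hardware supports.)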

Could you share a log file of the suspiciously slow run, as well as the command
line you used to start mdrun?

Cheers,

--
Szilárd


On Sun, Nov 4, 2012 at 5:32 PM, Albert  wrote:

> well, IC.
> the performance is rather poor compared to the GTX 590: 32 ns/day vs 4 ns/day.
> Is that perhaps also related to the warnings?
>
> THX
>
>
>
> On 11/04/2012 01:59 PM, Justin Lemkul wrote:
>
>>
>>
>> On 11/4/12 4:55 AM, Albert wrote:
>>
>>> hello:
>>>
>>>   I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti (2 x
>>> 1344
>>> CUDA cores), and I got the following warnings:
>>>
>>> thank you very much.
>>>
>>> ---messages---
>>>
>>> WARNING: On node 0: oversubscribing the available 0 logical CPU cores
>>> per node
>>> with 2 MPI processes.
>>>   This will cause considerable performance loss!
>>>
>>> 2 GPUs detected on host boreas:
>>>#0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
>>> compatible
>>>#1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
>>> compatible
>>>
>>> 2 GPUs auto-selected to be used for this run: #0, #1
>>>
>>> Using CUDA 8x8x8 non-bonded kernels
>>> Making 1D domain decomposition 1 x 2 x 1
>>>
>>> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
>>> We have just committed the new CPU detection code in this branch,
>>> and will commit new SSE/AVX kernels in a few days. However, this
>>> means that currently only the NxN kernels are accelerated!
>>> In the mean time, you might want to avoid production runs in 4.6.
>>>
>>>
>> I can't address the first warning, but the second is fairly obvious.
>>  You're not using an official release, you're using the development version
>> - let the user beware.  The code is not yet production-ready.
>>
>> -Justin
>>
>>


Re: [gmx-users] GPU warnings

2012-11-04 Thread Albert

well, IC.
the performance is rather poor compared to the GTX 590: 32 ns/day vs 4 ns/day.
Is that perhaps also related to the warnings?

THX


On 11/04/2012 01:59 PM, Justin Lemkul wrote:



On 11/4/12 4:55 AM, Albert wrote:

hello:

  I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti 
(2 x 1344

CUDA cores), and I got the following warnings:

thank you very much.

---messages---

WARNING: On node 0: oversubscribing the available 0 logical CPU cores 
per node

with 2 MPI processes.
  This will cause considerable performance loss!

2 GPUs detected on host boreas:
   #0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat: 
compatible
   #1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat: 
compatible


2 GPUs auto-selected to be used for this run: #0, #1

Using CUDA 8x8x8 non-bonded kernels
Making 1D domain decomposition 1 x 2 x 1

* WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
We have just committed the new CPU detection code in this branch,
and will commit new SSE/AVX kernels in a few days. However, this
means that currently only the NxN kernels are accelerated!
In the mean time, you might want to avoid production runs in 4.6.



I can't address the first warning, but the second is fairly obvious.  
You're not using an official release, you're using the development 
version - let the user beware.  The code is not yet production-ready.


-Justin





Re: [gmx-users] GPU warnings

2012-11-04 Thread Thomas Evangelidis
I'm also getting the first warning ("oversubscribing the available...") and
see no obvious performance gain. Do you know how to avoid that?

thanks,
Thomas



On 4 November 2012 14:59, Justin Lemkul  wrote:

>
>
> On 11/4/12 4:55 AM, Albert wrote:
>
>> hello:
>>
>>   I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti (2 x
>> 1344
>> CUDA cores), and I got the following warnings:
>>
>> thank you very much.
>>
>> ---messages---
>>
>> WARNING: On node 0: oversubscribing the available 0 logical CPU cores per
>> node
>> with 2 MPI processes.
>>   This will cause considerable performance loss!
>>
>> 2 GPUs detected on host boreas:
>>#0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
>> compatible
>>#1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat:
>> compatible
>>
>> 2 GPUs auto-selected to be used for this run: #0, #1
>>
>> Using CUDA 8x8x8 non-bonded kernels
>> Making 1D domain decomposition 1 x 2 x 1
>>
>> * WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
>> We have just committed the new CPU detection code in this branch,
>> and will commit new SSE/AVX kernels in a few days. However, this
>> means that currently only the NxN kernels are accelerated!
>> In the mean time, you might want to avoid production runs in 4.6.
>>
>>
> I can't address the first warning, but the second is fairly obvious.
>  You're not using an official release, you're using the development version
> - let the user beware.  The code is not yet production-ready.
>
> -Justin
>
> --
> ========================================
>
> Justin A. Lemkul, Ph.D.
> Research Scientist
> Department of Biochemistry
> Virginia Tech
> Blacksburg, VA
> jalemkul[at]vt.edu | (540) 231-9080
> http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin
>
> ========================================
>
>



-- 

==

Thomas Evangelidis

PhD student
University of Athens
Faculty of Pharmacy
Department of Pharmaceutical Chemistry
Panepistimioupoli-Zografou
157 71 Athens
GREECE

email: tev...@pharm.uoa.gr

  teva...@gmail.com


website: https://sites.google.com/site/thomasevangelidishomepage/


Re: [gmx-users] GPU warnings

2012-11-04 Thread Justin Lemkul



On 11/4/12 4:55 AM, Albert wrote:

hello:

  I am running Gromacs 4.6 GPU on a workstation with two GTX 660 Ti (2 x 1344
CUDA cores), and I got the following warnings:

thank you very much.

---messages---

WARNING: On node 0: oversubscribing the available 0 logical CPU cores per node
with 2 MPI processes.
  This will cause considerable performance loss!

2 GPUs detected on host boreas:
   #0: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat: compatible
   #1: NVIDIA GeForce GTX 660 Ti, compute cap.: 3.0, ECC:  no, stat: compatible

2 GPUs auto-selected to be used for this run: #0, #1

Using CUDA 8x8x8 non-bonded kernels
Making 1D domain decomposition 1 x 2 x 1

* WARNING * WARNING * WARNING * WARNING * WARNING * WARNING *
We have just committed the new CPU detection code in this branch,
and will commit new SSE/AVX kernels in a few days. However, this
means that currently only the NxN kernels are accelerated!
In the mean time, you might want to avoid production runs in 4.6.



I can't address the first warning, but the second is fairly obvious.  You're not 
using an official release, you're using the development version - let the user 
beware.  The code is not yet production-ready.


-Justin

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




Re: [gmx-users] GPU-C2075-simulation-solw or GPU only running -reg

2012-10-21 Thread Justin Lemkul



On 10/21/12 3:38 PM, venkatesh s wrote:

Respected Gromacs people,
 my query is: why is my system so
slow, and how can I improve the speed? It runs in about the same time (25
minutes) as on an "Intel Core i7" CPU alone.
Here I have given my entire system information, and I found that my system's
8 cores are not taking the job (only the GPU is running).



mdrun-gpu -device
"OpenMM:platform=Cuda,memtest=15,deviceid=0,force-device=yes" -v -deffnm nvt


Non-supported GPU selected (#0, Tesla C2075), forced continuing.Note, that
the simulation can be slow or it migth even crash.
Pre-simulation ~15s memtest in progress...
Memory test completed without errors.

Back Off! I just backed up nvt.log to ./#nvt.log.1#
Getting Loaded...
Reading file nvt.tpr, VERSION 4.5.5 (single precision)
Loaded with Money


Back Off! I just backed up nvt.trr to ./#nvt.trr.1#

Back Off! I just backed up nvt.edr to ./#nvt.edr.1#

WARNING: OpenMM supports only Andersen thermostat with the
md/md-vv/md-vv-avek integrators.


WARNING: OpenMM provides contraints as a combination of SHAKE, SETTLE and
CCMA. Accuracy is based on the SHAKE tolerance set by the "shake_tol"
option.


WARNING: Non-supported GPU selected (#0, Tesla C2075), forced
continuing.Note, that the simulation can be slow or it migth even crash.

Pre-simulation ~15s memtest in progress...done, no errors detected
starting mdrun 'Protein in water'
50000 steps, 100.0 ps.

OpenMM run - timing based on wallclock.

NODE (s)   Real (s)  (%)
Time:   1319.043   1319.043100.0
21:59
(Mnbf/s)   (MFlops)   (ns/day)  (hour/ns)
Performance:  0.000  0.006  6.550  3.664
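(Editorial aside: the reported ns/day follows directly from the wallclock time — 100 ps simulated in 1319.043 s. A quick arithmetic check:

```python
# Verify the reported performance: 100 ps simulated in 1319.043 s of wallclock.
sim_ns = 100 * 1e-3                   # 100 ps expressed in ns
wall_s = 1319.043                     # wallclock seconds from the log
ns_per_day = sim_ns / wall_s * 86400  # scale by seconds per day
print(round(ns_per_day, 3))           # ~6.55, matching the log's 6.550
```

So the 6.550 ns/day figure in the log is self-consistent with the timing line above it.)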

NVIDIA-SMI -l
+------------------------------------------------------------------------------+
| NVIDIA-SMI 3.295.59                            Driver Version: 295.59        |
|-------------------------------+----------------------+-----------------------|
| Nb.  Name                     | Bus Id        Disp.  | Volatile ECC SB / DB  |
| Fan   Temp   Power Usage /Cap | Memory Usage         | GPU Util.  Compute M. |
|===============================+======================+=======================|
| 0.  Tesla C2075               | :01:00.0  On         | 0          0          |
|  30%   75 C  P0   150W / 225W |   8%  435MB / 5375MB |   95%      Default    |
|-------------------------------+----------------------+-----------------------|
| Compute processes:                                               GPU Memory  |
|  GPU  PID     Process name                                       Usage       |
|==============================================================================|
|  0.   5889    mdrun-gpu                                          372MB       |
+------------------------------------------------------------------------------+


system:

top

top - 22:48:22 up 13 min,  4 users,  load average: 0.19, 0.18, 0.09
Tasks: 308 total,   2 running, 304 sleeping,   2 stopped,   0 zombie
Cpu0  : 16.4%us,  1.7%sy,  0.0%ni, 81.9%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu1  :  5.4%us,  0.7%sy,  0.0%ni, 94.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  9.3%us,  0.7%sy,  0.0%ni, 90.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.7%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  : 13.0%us,  0.7%sy,  0.0%ni, 86.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  1.0%us,  0.0%sy,  0.0%ni, 99.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.3%us,  0.3%sy,  0.0%ni, 99.3%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  12188656k total,  1191628k used, 10997028k free,34804k buffers
Swap:0k total,0k used,0k free,   418428k cached



system?
protein  +sol   +  NA   total atom(nvt.gro)
158 residues   10742234646



npt.mdp file

; Run parameters
integrator= md-vv;
nsteps= 50000; 2 * 50000 = 100 ps
dt= 0.002; 2 fs
; Output control
nstxout= 100; save coordinates every 0.2 ps
nstvout= 100; save velocities every 0.2 ps
nstenergy= 100; save energies every 0.2 ps
nstlog= 100; update log file every 0.2 ps
; Bond parameters
continuation= yes; Restarting after NVT
constraint_algorithm = lincs; holonomic constraints
constraints= all-bonds; all bonds (even heavy atom-H bonds)
constrained
lincs_iter= 1; accuracy of LINCS
lincs_order= 4; also related to accuracy
; Neighborsearching
ns_type= grid; search neighboring grid cells
nstlist= 5; 10 fs
rlist= 1.0; short-range neighborlist cutoff (in nm)
rcoulomb= 1.0; short-range electrostatic cutoff (in nm)
rvdw= 1.0; short-range van der Waals cutoff (in nm)
; Electrostatics
coulombtype= PME; Particle Mesh Ewald for long-range
electrostatics
pme_order= 4; cubic interpolation
fourierspacing= 0.16; grid spacing for FFT
; Temperature coupling is on

Re: [gmx-users] GPU-C2075-simulation-solw -reg

2012-10-20 Thread Justin Lemkul



On 10/20/12 1:34 PM, venkatesh s wrote:

Respected Gromacs Users
  i started the energy minimization, but
it's slow (showing the following):

Getting Loaded...
Reading file em.tpr, VERSION 4.5.5 (single precision)
Loaded with Money

WARNING: Non-supported GPU selected (#0, Tesla C2075), forced
continuing.Note, that the simulation can be slow or it might even crash.

Pre-simulation ~15s memtest in progress...done, no errors detected
starting mdrun 'Protein in water'
5 steps, 50.0 ps.

What should I do to increase the GPU speed?
Kindly provide a prompt solution.


No one can suggest a solution without a better statement of the problem.  What 
is your system?  How many atoms does it have?  How fast is it running? What is 
your .mdp file?  How do the benchmark systems perform?


-Justin

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




Re: [gmx-users] GPU -simulation error -reg

2012-10-14 Thread Justin Lemkul



On 10/14/12 8:01 AM, venkatesh s wrote:

Respected Gromacs People's,
  system containing protein+peptide (normally I use the
lysozyme tutorial md.mdp; I only change the number of nanoseconds)

  mdrun-gpu -v -deffnm  md_0_1

while running this i got fatal error like this (Following)

--
Getting Loaded...
Reading file md_0_1.tpr, VERSION 4.5.5 (single precision)
Loaded with Money


WARNING: OpenMM does not support leap-frog, will use velocity-verlet
integrator.


WARNING: OpenMM supports only Andersen thermostat with the
md/md-vv/md-vv-avek integrators.


---
Program mdrun-gpu, VERSION 4.5.5
Source code file:
/opt/softwares/compile/gromacs-4.5.5/src/kernel/openmm_wrapper.cpp, line:
580

Fatal error:
OpenMM does not support multiple temperature coupling groups.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
---


Kindly provide prompt answer


The error message is fairly self-explanatory.  You are using multiple 
temperature coupling groups (tc-grps in the .mdp file).  You can't do that when 
running on GPU.  Set tc-grps = System.
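(Editorial aside: the resulting .mdp section then collapses from multiple coupling groups to a single one. A sketch — the tau_t/ref_t values are illustrative, not from the original poster's file; the Andersen thermostat is the one the OpenMM warnings above say is supported:

```
; single temperature-coupling group, as required by mdrun-gpu/OpenMM
tcoupl   = andersen   ; OpenMM supports only the Andersen thermostat here
tc-grps  = System     ; one group instead of e.g. Protein / Non-Protein
tau_t    = 1.0        ; ps, illustrative value
ref_t    = 300        ; K, illustrative value
```

With separate groups such as `Protein Non-Protein`, each group needs its own tau_t/ref_t entry, which is exactly what the GPU path rejects.)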


-Justin

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




Re: [gmx-users] GPU

2012-06-13 Thread Szilárd Páll
On Wed, Jun 13, 2012 at 3:59 AM, Mark Abraham wrote:

> On 12/06/2012 10:49 PM, Ehud Schreiber wrote:
>
>> Message: 4
>>> Date: Mon, 11 Jun 2012 15:54:39 +1000
>>> From: Mark Abraham>
>>> Subject: Re: [gmx-users] GPU
>>> To: Discussion list for GROMACS users
>>> Message-ID:<4FD5881F.3040509@**anu.edu.au <4fd5881f.3040...@anu.edu.au>>
>>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>>>
>>> On 11/06/2012 2:32 AM, ifat shub wrote:
>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> If I understand correctly, currently the Gromacs GPU acceleration does
>>>> not support energy minimization. Is this so? Are there any plans to
>>>> include it in the 4.6 version or in a later one (i.e. to allow, say,
>>>> integrator = steep or cg in mdrun-gpu)? I would find such options
>>>> extremely useful.
>>>>
>>> EM is normally so quick that it's not worth putting much effort into
>>> accelerating it, compared to the CPU-months that are spent doing
>>> subsequent MD.
>>>
>>> Mark
>>>
>> Currently, my main use of Gromacs entails running multiple minimizations
>> on an ensemble of states.
>> Moreover, these states are not obtained using molecular dynamics but
>> rather using the Concoord algorithm.
>> Therefore, for me the bottleneck is not md but rather minimizations
>> (specifically, cg) and so their acceleration on GPUs would be very
>> advantageous.
>> If such usage is not totally idiosyncratic, I hope the development team
>> would reconsider GPU accelerating also minimizations.
>> I suspect this would not be technically too complex given the work
>> already done on dynamics.
>>
>
> I suspect the upcoming 4.6 release will have GPU-accelerated EM available
> as a side effect of the new Verlet pair-list scheme for computing
> non-bonded interactions. This development is unrelated to previous GPU
> efforts, I


It does work and has been tested extensively. We are working on the final
details,  but you can get the code from the nbnxn_hybrid_acc branch -- it's
pretty safe to use it for non-production purposes!

The pages Mark linked are the resources you want to start with before you
start using the NxN kernels.

Cheers,
--
Szilárd


> understand. See http://www.gromacs.org/Documentation/Acceleration_and_parallelization
> and http://www.gromacs.org/Documentation/Cut-off_schemes for some advance
> details. When you hear a call for alpha testers in the next few
> months, you might want to spend some time on that so that you're sure
> GROMACS will best meet your future needs. :-)



>
>
> Mark
>
>

Re: [gmx-users] GPU

2012-06-12 Thread Mark Abraham

On 12/06/2012 10:49 PM, Ehud Schreiber wrote:

Message: 4
Date: Mon, 11 Jun 2012 15:54:39 +1000
From: Mark Abraham
Subject: Re: [gmx-users] GPU
To: Discussion list for GROMACS users
Message-ID:<4fd5881f.3040...@anu.edu.au>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

On 11/06/2012 2:32 AM, ifat shub wrote:

Hi,



If I understand correctly, currently the Gromacs GPU acceleration does
not support energy minimization. Is this so? Are there any plans to
include it in the 4.6 version or in a later one (i.e. to allow, say,
integrator = steep or cg in mdrun-gpu)? I would find such options
extremely useful.

EM is normally so quick that it's not worth putting much effort into
accelerating it, compared to the CPU-months that are spent doing
subsequent MD.

Mark

Currently, my main use of Gromacs entails running multiple minimizations on an 
ensemble of states.
Moreover, these states are not obtained using molecular dynamics but rather 
using the Concoord algorithm.
Therefore, for me the bottleneck is not md but rather minimizations 
(specifically, cg) and so their acceleration on GPUs would be very advantageous.
If such usage is not totally idiosyncratic, I hope the development team would 
reconsider GPU accelerating also minimizations.
I suspect this would not be technically too complex given the work already done 
on dynamics.


I suspect the upcoming 4.6 release will have GPU-accelerated EM 
available as a side effect of the new Verlet pair-list scheme for 
computing non-bonded interactions. This development is unrelated to 
previous GPU efforts, I understand. See 
http://www.gromacs.org/Documentation/Acceleration_and_parallelization 
and http://www.gromacs.org/Documentation/Cut-off_schemes for some 
advance details. When you hear a call for alpha testers in the next few 
months, you might want to spend some time on that so that you're sure 
GROMACS will best meet your future needs. :-)


Mark


Re: [gmx-users] GPU

2012-06-12 Thread Ehud Schreiber
>Message: 4
>Date: Mon, 11 Jun 2012 15:54:39 +1000
>From: Mark Abraham 
>Subject: Re: [gmx-users] GPU
>To: Discussion list for GROMACS users 
>Message-ID: <4fd5881f.3040...@anu.edu.au>
>Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
>On 11/06/2012 2:32 AM, ifat shub wrote:
>> Hi,
>>
>>
>>
>> If I understand correctly, currently the Gromacs GPU acceleration does
>> not support energy minimization. Is this so? Are there any plans to
>> include it in the 4.6 version or in a later one (i.e. to allow, say,
>> integrator = steep or cg in mdrun-gpu)? I would find such options
>> extremely useful.

>EM is normally so quick that it's not worth putting much effort into 
>accelerating it, compared to the CPU-months that are spent doing 
>subsequent MD.
>
>Mark

Currently, my main use of Gromacs entails running multiple minimizations on an 
ensemble of states.
Moreover, these states are not obtained using molecular dynamics but rather 
using the Concoord algorithm.
Therefore, for me the bottleneck is not md but rather minimizations 
(specifically, cg) and so their acceleration on GPUs would be very advantageous.
If such usage is not totally idiosyncratic, I hope the development team would 
reconsider GPU accelerating also minimizations.
I suspect this would not be technically too complex given the work already done 
on dynamics.

Thanks,
Ehud Schreiber.



Re: [gmx-users] GPU

2012-06-10 Thread Mark Abraham

On 11/06/2012 2:32 AM, ifat shub wrote:

Hi,



If I understand correctly, currently the Gromacs GPU acceleration does
not support energy minimization. Is this so? Are there any plans to
include it in the 4.6 version or in a later one (i.e. to allow, say,
integrator = steep or cg in mdrun-gpu)? I would find such options
extremely useful.


EM is normally so quick that it's not worth putting much effort into 
accelerating it, compared to the CPU-months that are spent doing 
subsequent MD.


Mark


Re: [gmx-users] GPU crashes

2012-06-07 Thread Justin A. Lemkul



On 6/7/12 3:57 AM, lloyd riggs wrote:

Did you play with the time step?  Just curious, but I wondered what
happened with 0.0008, 0.0005, 0.0002.  I found that if I had a well-behaved
protein, as soon as I added a small (non-protein) molecule which rotated
wildly while attached to the protein, it would crash unless I reduced the
time step to the above when constraints were removed after EQ ... always it
seemed to me it didnt like the rotation or bond angles, seeing them as a
violation but acted like it was an amino acid? (the same bond type but with
wider rotation as one end wasnt fixed to a chain)  If your loop moves via
backbone, the calculated angles, bonds or whatever might appear to the
computer to be violating the parameter settings for problems, errors, etc as
it cant track them fast enough over the time step. Ie atom 1-2-3 and then
delta 1-2-3 with xyz parameters, but then the particular set has additional
rotation, etc and may include the chain atoms which bend wildly (n-Ca-Cb-Cg
maybe a dihedral) but proba! bly not this.

Just a thought, and probably not the right answer either. It might be the
way the work is broken down (above) over GPUs, which convert everything to
matrices (non-standard, just for basic math operations, not real matrices
per se) for execution, and then some library problem fails to account for
long-range rapid (0.0005) movements at the chain (Ca, N, O to something
else) and tries to apply these to Cb-Cg-O-H, etc., using the initial points
while looking at the parameters for, say, a single amino acid. Maybe the
constraints would cause this, which would make it a pain to EQ; constraints
let me increase the time step, but would have ruined the experiment I was
working on, as I needed the system unconstrained to show it didn't float
away when proteins were pulled, etc. I was using a different integrator
though, just normal MD.



I have long wondered whether constraints are properly handled by the OpenMM 
library. I am constraining all bonds, so in principle, a dt of 0.002 should not 
be a problem. The note printed indicates that the constraint algorithm is changed 
from the one selected (LINCS) to whatever OpenMM uses (SHAKE and a few others in 
combination). Perhaps I can try running without constraints and a reduced dt, 
but I'd like to avoid it.


I wish I could efficiently test whether this behavior is GPU-specific, but 
unfortunately the non-GPU implementation of the implicit code can currently only 
be run in serial or on 2 CPUs due to an existing bug.  I can certainly test it, 
but due to the large number of atoms, it will take several days to even approach 
1 ns.



And your cutoffs for vdw, etc.: why are they 0? I don't know if this means
a default set is then used, but if not? Wouldn't it try integrating using
both types of formula, or would it be just using Coulomb or vice versa? (I
don't know what that would do to the code, but I assume it means no vdw and
all Coulomb, and zeros are always a problem for computers.)



The setup is for the all-vs-all kernels.  Setting cutoffs equal to zero and 
using a fixed neighbor list triggers these special optimized kernels.  I have 
also noticed that long, finite cutoffs (on the order of 4.0 nm) lead to 
unacceptable energy drift and structural instability in well-behaved systems 
(even the benchmarks).  For instance, the backbone RMSD of lysozyme is twice as 
large in the case of a 4.0-nm cutoff relative to the all-vs-all setup, and the 
energy drift is quite substantial.


-Justin

--


Justin A. Lemkul, Ph.D.
Research Scientist
Department of Biochemistry
Virginia Tech
Blacksburg, VA
jalemkul[at]vt.edu | (540) 231-9080
http://www.bevanlab.biochem.vt.edu/Pages/Personal/justin




Re: [gmx-users] GPU crashes

2012-06-07 Thread lloyd riggs
Did you play with the time step? Just curious, but I wondered what
happened with 0.0008, 0.0005, 0.0002. I found that if I had a well-behaved
protein, as soon as I added a small (non-protein) molecule which rotated
wildly while attached to the protein, it would crash unless I reduced the
time step to the above once constraints were removed after EQ. It always
seemed to me that it didn't like the rotation or bond angles, seeing them
as a violation, but treated the molecule as if it were an amino acid (the
same bond type, but with wider rotation, as one end wasn't fixed to a
chain). If your loop moves via the backbone, the calculated angles, bonds,
or whatever might appear to the computer to violate the parameter settings,
as it can't track them fast enough over the time step, i.e. atoms 1-2-3 and
then delta 1-2-3 with xyz parameters, but then the particular set has
additional rotation, etc., and may include the chain atoms, which bend
wildly (N-Ca-Cb-Cg, maybe a dihedral), but probably not this.

Just a thought, and probably not the right answer either. It might be the
way the work is broken down (above) over GPUs, which convert everything to
matrices (non-standard, just for basic math operations, not real matrices
per se) for execution, and then some library problem fails to account for
long-range rapid (0.0005) movements at the chain (Ca, N, O to something
else) and tries to apply these to Cb-Cg-O-H, etc., using the initial points
while looking at the parameters for, say, a single amino acid. Maybe the
constraints would cause this, which would make it a pain to EQ; constraints
let me increase the time step, but would have ruined the experiment I was
working on, as I needed the system unconstrained to show it didn't float
away when proteins were pulled, etc. I was using a different integrator
though, just normal MD.

And your cutoffs for vdw, etc.: why are they 0? I don't know if this means
a default set is then used, but if not? Wouldn't it try integrating using
both types of formula, or would it be just using Coulomb or vice versa? (I
don't know what that would do to the code, but I assume it means no vdw and
all Coulomb, and zeros are always a problem for computers.)

That's my thoughts on that. Probably something else though.

Good luck,

Stephan

 Original-Nachricht 
> Datum: Wed, 06 Jun 2012 18:42:45 -0400
> Von: "Justin A. Lemkul" 
> An: Discussion list for GROMACS users 
> Betreff: [gmx-users] GPU crashes

> 
> Hi All,
> 
> I'm wondering if anyone has experienced what I'm seeing with Gromacs 4.5.5
> on 
> GPU.  It seems that certain systems fail inexplicably.  The system I am
> working 
> with is a heterodimeric protein complex bound to DNA.  After about 1 ns of
> simulation time using mdrun-gpu, all the energies become NaN.  The
> simulations 
> don't stop, they just carry on merrily producing nonsense.  I would love
> to see 
> some action regarding http://redmine.gromacs.org/issues/941 for this
> reason ;)
> 
> I ran simulations of each of the components of the system individually -
> each 
> protein alone, and DNA - to try to track down what might be causing this 
> problem.  The DNA simulation is perfectly stable out to 10 ns, but each
> protein 
> fails within 2 ns.  Each protein has two domains with a flexible linker,
> and it 
> seems that as soon as the linker flexes a bit, the simulations go poof. 
> Well-behaved proteins like lysozyme and DHFR (from the benchmark set) seem
> fine, 
> but anything that twitches even a small amount fails.  This is very
> unfortunate 
> for us, as we are hoping to see domain motions on a feasible time scale
> using 
> implicit solvent on GPU hardware.
> 
> Has anyone seen anything like this?  Our Gromacs implementation is being
> run on 
> an x86_64 Linux system with Tesla S2050 GPU cards.  The CUDA version is
> 3.1 and 
> Gromacs is linked against OpenMM-2.0.  An .mdp file is appended below.  I
> have 
> also tested finite values for cutoffs, but the results were worse
> (failures 
> occurred more quickly).
> 
> I have not been able to use the latest git version of Gromacs to test
> whether 
> anything has been fixed, but will post separately to gmx-developers
> regarding 
> the reasons for that soon.
> 
> -Justin
> 
> === md.mdp ===
> 
> title   = Implicit solvent test
> ; Run parameters
> integrator  = sd
> dt  = 0.002
> nsteps  = 500   ; 1 ps (10 ns)
> nstcomm = 1
> comm_mode   = angular   ; non-periodic system
> ; Output parameters
> nstxout = 0
> nstvout = 0
> nstfout = 0
> nstxtcout   = 1000  ; every 2 ps
> nstlog  = 5000  ; every 10 ps
> nstenergy   = 1000  ; every 2 ps
> ; Bond parameters
> constraint_algorithm= lincs
> constraints = all-bonds
> continuation= no; starting up
> ; required cutoffs for implicit
> nstlist = 0
> ns_type = grid
> rlist   = 0

Re: [gmx-users] GPU gets faster with more molecules in system

2011-01-24 Thread Mark Abraham

On 25/01/2011 8:25 AM, Christian Mötzing wrote:

Hi,

I compiled mdrun-gpu and tried some waterbox systems with different
atom counts.

atoms  | GPU| CPU
2.400  | 1.015s | 774s
4.800  | 1.225s | 1.202s
9.600  | 1.142s | 1.353s
19.200 | 2.984s | 2.812s

Why does the system with 9.600 atoms finish faster than the one with
4.800? I triple-checked the simulations, and GROMACS even tells me that
the atom count in each system is as above, so I think there is no mistake
there. A diff of md.log only shows differences in output values for each
step.

Is there any explanation for this behaviour?


As a guess, the cost of overheads for molecular simulations tends to have 
a weaker dependence on system size than the cost of computation (or none 
at all). Only once the latter dominates the cost do you see scaling with 
system size.


I expect you'd see similar behaviour running systems with 64, 128, 256, 
512 atoms on 64 processors.


Mark


Re: [gmx-users] gpu

2010-11-07 Thread Erik Wensink
tnx.
Erik

--- On Sun, 11/7/10, Rossen Apostolov  wrote:

From: Rossen Apostolov 
Subject: Re: [gmx-users] gpu
To: gmx-users@gromacs.org
Date: Sunday, November 7, 2010, 4:27 PM

Hi,

Did you read this? http://www.gromacs.org/gpu

Rossen

On 11/7/10 1:23 PM, Erik Wensink wrote:

Dear gmx-users,

How to invoke the gpu for simulations, e.g. is there a (compiler) flag?

Cheers,

Erik

Re: [gmx-users] gpu

2010-11-07 Thread Rossen Apostolov

Hi,

Did you read this? http://www.gromacs.org/gpu

Rossen

On 11/7/10 1:23 PM, Erik Wensink wrote:

Dear gmx-users,
How to invoke the gpu for simulations, e.g. is there a (compiler) flag?
Cheers,
Erik





Re: [gmx-users] GPU slower than I7

2010-10-25 Thread Renato Freitas
Hi,

My OS is Fedora 13 (64-bit) and I used gcc 4.4.4. I ran the program
you sent me. Below are the results of 5 runs; as you can see, the
results are roughly the same:

[ren...@scrat ~]$ ./time
2.09 2.102991
[ren...@scrat ~]$ ./time
2.09 2.102808
[ren...@scrat ~]$ ./time
2.09 2.104577
[ren...@scrat ~]$ ./time
2.09 2.103943
[ren...@scrat ~]$ ./time
2.09 2.104471

Below is part of src/config.h:
.
.
.

/* Define to 1 if you have the MSVC _aligned_malloc() function. */
/* #undef HAVE__ALIGNED_MALLOC */

/* Define to 1 if you have the gettimeofday() function. */
#define HAVE_GETTIMEOFDAY

/* Define to 1 if you have the cbrt() function. */
#define HAVE_CBRT
.
.
.

 Is this OK?

Renato




2010/10/22 Roland Schulz :
> Hi,
>
> On Fri, Oct 22, 2010 at 3:20 PM, Renato Freitas  wrote:
>>
>> Do you think that the "NODE" and "Real" time difference could be
>> attributed to some compilation problem in the mdrun-gpu. Despite I'm
>> asking this I didn't get any error in the compilation.
>
> It is very odd that these are different for your system. What operating
> system and compiler do you use?
> Is HAVE_GETTIMEOFDAY set in src/config.h?
> I attached a small test program which uses the two different timers used for
> NODE and Real time. You can compile it with cc time.c -o time and run it
> with ./time. Do you get roughly the same time twice with the test program or
> do you see the same discrepancy as with GROMACS?
> Roland
>>
>> Thanks,
>>
>> Renato
>>
>> 2010/10/22 Szilárd Páll :
>> > Hi Renato,
>> >
>> > First of all, what you're seeing is pretty normal, especially as you
>> > have a CPU that is crossing the border of insane :) Why is it normal?
>> > The PME algorithms are simply not well suited for current GPU
>> > architectures. With an ill-suited algorithm you won't be able to see
>> > the speedups you can often see in other application areas, even more
>> > so when you're comparing to Gromacs on an i7 980X. For
>> > more info + benchmarks see the Gromacs-GPU page:
>> > http://www.gromacs.org/gpu
>> >
>> > However, there is one strange thing you also pointed out. The fact
>> > that the "NODE" and "Real" times in your mdrun-gpu timing summary are
>> > not the same, but show a 3x deviation, is _very_ unusual. I've run
>> > mdrun-gpu on quite a wide variety of hardware, but I've never seen
>> > those two counters deviate. It might be an artifact of the cycle
>> > counters used internally behaving in an unusual way on your CPU.
>> >
>> > One other thing I should point out is that you would be better off
>> > using the standard mdrun which in 4.5 by default has thread-support
>> > and therefore will run on a single cpu/node without MPI!
>> >
>> > Cheers,
>> > --
>> > Szilárd
>> >
>> >
>> >
>> > On Thu, Oct 21, 2010 at 9:18 PM, Renato Freitas 
>> > wrote:
>> >> Hi gromacs users,
>> >>
>> >> I have installed the latest version of gromacs (4.5.1) on an i7 980X
>> >> (6 cores, or 12 with HT on; 3.3 GHz) with 12 GB of RAM and compiled its
>> >> mpi version. I also compiled the GPU-accelerated
>> >> version of gromacs. Then I did a 2 ns simulation using a small system
>> >> (11042 atoms) to compare the performance of mdrun-gpu vs mdrun_mpi.
>> >> The results that I got are below:
>> >>
>> >> 
>> >> My *.mdp is:
>> >>
>> >> constraints         =  all-bonds
>> >> integrator          =  md
>> >> dt                  =  0.002    ; ps !
>> >> nsteps              =  100  ; total 2000 ps.
>> >> nstlist             =  10
>> >> ns_type             =  grid
>> >> coulombtype    = PME
>> >> rvdw                = 0.9
>> >> rlist               = 0.9
>> >> rcoulomb            = 0.9
>> >> fourierspacing      = 0.10
>> >> pme_order           = 4
>> >> ewald_rtol          = 1e-5
>> >> vdwtype             =  cut-off
>> >> pbc                 =  xyz
>> >> epsilon_rf    =  0
>> >> comm_mode           =  linear
>> >> nstxout             =  1000
>> >> nstvout             =  0
>> >> nstfout             =  0
>> >> nstxtcout           =  1000
>> >> nstlog              =  1000
>> >> nstenergy           =  1000
>> >> ; Berendsen temperature coupling is on in four groups
>> >> tcoupl              = berendsen
>> >> tc-grps             = system
>> >> tau-t               = 0.1
>> >> ref-t               = 298
>> >> ; Pressure coupling is on
>> >> Pcoupl = berendsen
>> >> pcoupltype = isotropic
>> >> tau_p = 0.5
>> >> compressibility = 4.5e-5
>> >> ref_p = 1.0
>> >> ; Generate velocites is on at 298 K.
>> >> gen_vel = no
>> >>
>> >> 
>> >> RUNNING GROMACS ON GPU
>> >>
>> >> mdrun-gpu -s topol.tpr -v > & out &
>> >>
>> >> Here is a part of the md.log:
>> >>
>> >> Started mdrun on node 0 Wed Oct 20 09:52:09 2010
>> >> .
>> >> .
>> >> .
>> >>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>> >>
>> >>  Computing:     Nodes   Number          G-Cycles        Seconds     %
>> >>
>> >> ---

Re: [gmx-users] GPU slower than I7

2010-10-22 Thread Roland Schulz
Hi,

On Fri, Oct 22, 2010 at 3:20 PM, Renato Freitas  wrote:
>
>
> Do you think that the "NODE" and "Real" time difference could be
> attributed to some compilation problem in the mdrun-gpu. Despite I'm
> asking this I didn't get any error in the compilation.
>

It is very odd that these are different for your system. What operating
system and compiler do you use?

Is HAVE_GETTIMEOFDAY set in src/config.h?

I attached a small test program which uses the two different timers used for
NODE and Real time. You can compile it with cc time.c -o time and run it
with ./time. Do you get roughly the same time twice with the test program or
do you see the same discrepancy as with GROMACS?

Roland

Thanks,
>
> Renato
>
> 2010/10/22 Szilárd Páll :
> > Hi Renato,
> >
> > First of all, what you're seeing is pretty normal, especially as you
> > have a CPU that is crossing the border of insane :) Why is it normal?
> > The PME algorithms are simply not well suited for current GPU
> > architectures. With an ill-suited algorithm you won't be able to see
> > the speedups you can often see in other application areas, even more
> > so when you're comparing to Gromacs on an i7 980X. For
> > more info + benchmarks see the Gromacs-GPU page:
> > http://www.gromacs.org/gpu
> >
> > However, there is one strange thing you also pointed out. The fact
> > that the "NODE" and "Real" times in your mdrun-gpu timing summary are
> > not the same, but show a 3x deviation, is _very_ unusual. I've run
> > mdrun-gpu on quite a wide variety of hardware, but I've never seen
> > those two counters deviate. It might be an artifact of the cycle
> > counters used internally behaving in an unusual way on your CPU.
> >
> > One other thing I should point out is that you would be better off
> > using the standard mdrun which in 4.5 by default has thread-support
> > and therefore will run on a single cpu/node without MPI!
> >
> > Cheers,
> > --
> > Szilárd
> >
> >
> >
> > On Thu, Oct 21, 2010 at 9:18 PM, Renato Freitas 
> wrote:
> >> Hi gromacs users,
> >>
> >> I have installed the latest version of gromacs (4.5.1) on an i7 980X
> >> (6 cores, or 12 with HT on; 3.3 GHz) with 12 GB of RAM and compiled its
> >> mpi version. I also compiled the GPU-accelerated
> >> version of gromacs. Then I did a 2 ns simulation using a small system
> >> (11042 atoms) to compare the performance of mdrun-gpu vs mdrun_mpi.
> >> The results that I got are below:
> >>
> >> 
> >> My *.mdp is:
> >>
> >> constraints =  all-bonds
> >> integrator  =  md
> >> dt  =  0.002; ps !
> >> nsteps  =  100  ; total 2000 ps.
> >> nstlist =  10
> >> ns_type =  grid
> >> coulombtype= PME
> >> rvdw= 0.9
> >> rlist   = 0.9
> >> rcoulomb= 0.9
> >> fourierspacing  = 0.10
> >> pme_order   = 4
> >> ewald_rtol  = 1e-5
> >> vdwtype =  cut-off
> >> pbc =  xyz
> >> epsilon_rf=  0
> >> comm_mode   =  linear
> >> nstxout =  1000
> >> nstvout =  0
> >> nstfout =  0
> >> nstxtcout   =  1000
> >> nstlog  =  1000
> >> nstenergy   =  1000
> >> ; Berendsen temperature coupling is on in four groups
> >> tcoupl  = berendsen
> >> tc-grps = system
> >> tau-t   = 0.1
> >> ref-t   = 298
> >> ; Pressure coupling is on
> >> Pcoupl = berendsen
> >> pcoupltype = isotropic
> >> tau_p = 0.5
> >> compressibility = 4.5e-5
> >> ref_p = 1.0
> >> ; Generate velocites is on at 298 K.
> >> gen_vel = no
> >>
> >> 
> >> RUNNING GROMACS ON GPU
> >>
> >> mdrun-gpu -s topol.tpr -v > & out &
> >>
> >> Here is a part of the md.log:
> >>
> >> Started mdrun on node 0 Wed Oct 20 09:52:09 2010
> >> .
> >> .
> >> .
> >> R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
> >>
> >>  Computing:     Nodes   Number    G-Cycles    Seconds      %
> >> ------------------------------------------------------------
> >>  Write traj.        1     1021     106.075       31.7    0.2
> >>  Rest               1            64125.577    19178.6   99.8
> >> ------------------------------------------------------------
> >>  Total              1            64231.652    19210.3  100.0
> >>
> >>              NODE (s)    Real (s)     (%)
> >>  Time:       6381.840   19210.349    33.2
> >>              1h46:21
> >>              (Mnbf/s)    (MFlops)   (ns/day)   (hour/ns)
> >> Performance:    0.000       0.001     27.077      0.886
> >>
> >> Finished mdrun on node 0 Wed Oct 2

Re: [gmx-users] GPU slower than I7

2010-10-22 Thread Renato Freitas
Hi Szilárd,

Thanks for your explanation. Do you know if there will be new
improvements to the PME algorithms to take full advantage of GPU video
cards?

Do you think that the "NODE" and "Real" time difference could be
attributed to some compilation problem in the mdrun-gpu. Despite I'm
asking this I didn't get any error in the compilation.

Thanks,

Renato

2010/10/22 Szilárd Páll :
> Hi Renato,
>
> First of all, what you're seeing is pretty normal, especially as you
> have a CPU that is crossing the border of insane :) Why is it normal?
> The PME algorithms are simply not well suited for current GPU
> architectures. With an ill-suited algorithm you won't be able to see
> the speedups you can often see in other application areas, even more
> so when you're comparing to Gromacs on an i7 980X. For
> more info + benchmarks see the Gromacs-GPU page:
> http://www.gromacs.org/gpu
>
> However, there is one strange thing you also pointed out. The fact
> that the "NODE" and "Real" times in your mdrun-gpu timing summary are
> not the same, but show a 3x deviation, is _very_ unusual. I've run
> mdrun-gpu on quite a wide variety of hardware, but I've never seen
> those two counters deviate. It might be an artifact of the cycle
> counters used internally behaving in an unusual way on your CPU.
>
> One other thing I should point out is that you would be better off
> using the standard mdrun which in 4.5 by default has thread-support
> and therefore will run on a single cpu/node without MPI!
>
> Cheers,
> --
> Szilárd
>
>
>
> On Thu, Oct 21, 2010 at 9:18 PM, Renato Freitas  wrote:
>> Hi gromacs users,
>>
>> I have installed the latest version of gromacs (4.5.1) on an i7 980X
>> (6 cores, or 12 with HT on; 3.3 GHz) with 12 GB of RAM and compiled its
>> mpi version. I also compiled the GPU-accelerated
>> version of gromacs. Then I did a 2 ns simulation using a small system
>> (11042 atoms) to compare the performance of mdrun-gpu vs mdrun_mpi.
>> The results that I got are below:
>>
>> 
>> My *.mdp is:
>>
>> constraints         =  all-bonds
>> integrator          =  md
>> dt                  =  0.002    ; ps !
>> nsteps              =  100  ; total 2000 ps.
>> nstlist             =  10
>> ns_type             =  grid
>> coulombtype    = PME
>> rvdw                = 0.9
>> rlist               = 0.9
>> rcoulomb            = 0.9
>> fourierspacing      = 0.10
>> pme_order           = 4
>> ewald_rtol          = 1e-5
>> vdwtype             =  cut-off
>> pbc                 =  xyz
>> epsilon_rf    =  0
>> comm_mode           =  linear
>> nstxout             =  1000
>> nstvout             =  0
>> nstfout             =  0
>> nstxtcout           =  1000
>> nstlog              =  1000
>> nstenergy           =  1000
>> ; Berendsen temperature coupling is on in four groups
>> tcoupl              = berendsen
>> tc-grps             = system
>> tau-t               = 0.1
>> ref-t               = 298
>> ; Pressure coupling is on
>> Pcoupl = berendsen
>> pcoupltype = isotropic
>> tau_p = 0.5
>> compressibility = 4.5e-5
>> ref_p = 1.0
>> ; Generate velocites is on at 298 K.
>> gen_vel = no
>>
>> 
>> RUNNING GROMACS ON GPU
>>
>> mdrun-gpu -s topol.tpr -v > & out &
>>
>> Here is a part of the md.log:
>>
>> Started mdrun on node 0 Wed Oct 20 09:52:09 2010
>> .
>> .
>> .
>>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>>  Computing:     Nodes   Number    G-Cycles    Seconds      %
>> ------------------------------------------------------------
>>  Write traj.        1     1021     106.075       31.7    0.2
>>  Rest               1            64125.577    19178.6   99.8
>> ------------------------------------------------------------
>>  Total              1            64231.652    19210.3  100.0
>>
>>              NODE (s)    Real (s)     (%)
>>  Time:       6381.840   19210.349    33.2
>>              1h46:21
>>              (Mnbf/s)    (MFlops)   (ns/day)   (hour/ns)
>> Performance:    0.000       0.001     27.077      0.886
>>
>> Finished mdrun on node 0 Wed Oct 20 15:12:19 2010
>>
>> 
>> RUNNING GROMACS ON MPI
>>
>> mpirun -np 6 mdrun_mpi -s topol.tpr -npme 3 -v > & out &
>>
>> Here is a part of the md.log:
>>
>> Started mdrun on node 0 Wed Oct 20 18:30:52 2010
>>
>>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>>
>>  Computing:             Nodes   Number  G-Cycles    Seconds             %
>> --
>>  Domain decomp. 3              11     1452.166      434.7      

Re: [gmx-users] GPU slower than I7

2010-10-22 Thread Renato Freitas
Hi Roland,

In fact I get better performance values using different rcoulomb,
fourierspacing and the values of -npme suggested by g_tune_pme using
-nt=12.

The simulation using the GPU was carried out on the dedicated machine;
no other programs were running, and even the graphical interface was
stopped.

About the CPU vs GPU simulation time, Szilárd explained that the PME
algorithms still are not very well suited for current GPU
architectures. I just don't know why the NODE and REAL times are not
equal.

Thanks,

Renato


2010/10/21 Roland Schulz :
>
>
> On Thu, Oct 21, 2010 at 5:53 PM, Renato Freitas  wrote:
>>
>> Thanks Roland. I will do a newer test using the fourier spacing equal
>> to 0.11.
>
> I'd also suggest looking at g_tune_pme and running with different rcoulomb
> and fourier_spacing values. As long as the ratio is the same you get the same
> accuracy, and you should get better performance (especially on the GPU) for a
> longer cut-off and larger grid spacing.
>
>>
>> However, about the performance of GPU versus CPU (mpi) let me
>> try to explain it better:
>
>
>>
>> GPU
>>
>>             NODE (s)                Real (s)                (%)
>> Time:    6381.840                19210.349            33.2
>>                            1h46:21
>>                         (Mnbf/s)   (MFlops)     (ns/day)        (hour/ns)
>> Performance:    0.000       0.001          27.077          0.886
>>
>> MPI
>>
>>             NODE (s)         Real (s)                    (%)
>> Time:    12621.257       12621.257               100.0
>>                           3h30:21
>>                      (Mnbf/s)      (GFlops)     (ns/day)        (hour/ns)
>> Performance: 388.633      28.773        13.691         1.753
>
> Yes, sorry, I didn't realize that NODE time and Real time are different. Did
> you run the GPU calculation on a desktop machine which was also doing other
> things at the time? This might explain it. As far as I know, on a dedicated
> machine not running any other programs, NODE and Real time should be the
> same.
>>
>> Looking above, we can see that gromacs prints in the output that
>> the simulation is faster when the GPU is used. But this is not the
>> reality. The truth is that the simulation with MPI was 106 min faster
>> than the one with GPU. Does that seem correct to you? As I said before,
>> I was expecting the GPU to take less time than the 6-core MPI run.
>
>  Well, the exact time depends on a lot of factors, and you can probably speed
> up both. But I would expect them both to be about equally fast.
> Roland


Re: [gmx-users] GPU slower than I7

2010-10-22 Thread Szilárd Páll
Hi Renato,

First of all, what you're seeing is pretty normal, especially as you
have a CPU that is crossing the border of insane :) Why is it normal?
The PME algorithms are simply not well suited for current GPU
architectures. With an ill-suited algorithm you won't be able to see
the speedups you can often see in other application areas, even more
so when you're comparing to Gromacs on an i7 980X. For
more info + benchmarks see the Gromacs-GPU page:
http://www.gromacs.org/gpu

However, there is one strange thing you also pointed out. The fact
that the "NODE" and "Real" times in your mdrun-gpu timing summary are
not the same, but show a 3x deviation, is _very_ unusual. I've run
mdrun-gpu on quite a wide variety of hardware, but I've never seen
those two counters deviate. It might be an artifact of the cycle
counters used internally behaving in an unusual way on your CPU.

One other thing I should point out is that you would be better off
using the standard mdrun which in 4.5 by default has thread-support
and therefore will run on a single cpu/node without MPI!

Cheers,
--
Szilárd



On Thu, Oct 21, 2010 at 9:18 PM, Renato Freitas  wrote:
> Hi gromacs users,
>
> I have installed the latest version of gromacs (4.5.1) on an i7 980X
> (6 cores, or 12 with HT on; 3.3 GHz) with 12 GB of RAM and compiled its
> mpi version. I also compiled the GPU-accelerated
> version of gromacs. Then I did a 2 ns simulation using a small system
> (11042 atoms) to compare the performance of mdrun-gpu vs mdrun_mpi.
> The results that I got are below:
>
> 
> My *.mdp is:
>
> constraints         =  all-bonds
> integrator          =  md
> dt                  =  0.002    ; ps !
> nsteps              =  100  ; total 2000 ps.
> nstlist             =  10
> ns_type             =  grid
> coulombtype    = PME
> rvdw                = 0.9
> rlist               = 0.9
> rcoulomb            = 0.9
> fourierspacing      = 0.10
> pme_order           = 4
> ewald_rtol          = 1e-5
> vdwtype             =  cut-off
> pbc                 =  xyz
> epsilon_rf    =  0
> comm_mode           =  linear
> nstxout             =  1000
> nstvout             =  0
> nstfout             =  0
> nstxtcout           =  1000
> nstlog              =  1000
> nstenergy           =  1000
> ; Berendsen temperature coupling is on in four groups
> tcoupl              = berendsen
> tc-grps             = system
> tau-t               = 0.1
> ref-t               = 298
> ; Pressure coupling is on
> Pcoupl = berendsen
> pcoupltype = isotropic
> tau_p = 0.5
> compressibility = 4.5e-5
> ref_p = 1.0
> ; Generate velocites is on at 298 K.
> gen_vel = no
>
> 
> RUNNING GROMACS ON GPU
>
> mdrun-gpu -s topol.tpr -v > & out &
>
> Here is a part of the md.log:
>
> Started mdrun on node 0 Wed Oct 20 09:52:09 2010
> .
> .
> .
>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
>  Computing:     Nodes   Number    G-Cycles    Seconds      %
> ------------------------------------------------------------
>  Write traj.        1     1021     106.075       31.7    0.2
>  Rest               1            64125.577    19178.6   99.8
> ------------------------------------------------------------
>  Total              1            64231.652    19210.3  100.0
>
>              NODE (s)    Real (s)     (%)
>  Time:       6381.840   19210.349    33.2
>              1h46:21
>              (Mnbf/s)    (MFlops)   (ns/day)   (hour/ns)
> Performance:    0.000       0.001     27.077      0.886
>
> Finished mdrun on node 0 Wed Oct 20 15:12:19 2010
>
> 
> RUNNING GROMACS ON MPI
>
> mpirun -np 6 mdrun_mpi -s topol.tpr -npme 3 -v > & out &
>
> Here is a part of the md.log:
>
> Started mdrun on node 0 Wed Oct 20 18:30:52 2010
>
>     R E A L   C Y C L E   A N D   T I M E   A C C O U N T I N G
>
>  Computing:         Nodes     Number     G-Cycles     Seconds       %
> ----------------------------------------------------------------------
>  Domain decomp.         3         11     1452.166       434.7     0.6
>  DD comm. load          3      10001        0.745         0.2     0.0
>  Send X to PME          3        101      249.003        74.5     0.1
>  Comm. coord.           3        101      637.329       190.8     0.3
>  Neighbor search        3         11     8738.669      2616.0     3.5
>  Force                  3        101    99210.202     29699.2    39.2
>  Wait + Comm. F         3        101     3361.591      1006.3     1.3
> 

Re: [gmx-users] GPU slower than I7

2010-10-21 Thread Roland Schulz
On Thu, Oct 21, 2010 at 5:53 PM, Renato Freitas  wrote:

> Thanks Roland. I will do a newer test using the fourier spacing equal
> to 0.11.

I'd also suggest looking at g_tune_pme and running with different rcoulomb
and fourier_spacing values. As long as their ratio stays the same, you get the
same accuracy, and you should see better performance (especially on the GPU)
with a longer cut-off and a larger grid spacing.
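The claim that a constant rcoulomb/fourier_spacing ratio preserves accuracy can be sketched numerically. Below is a minimal Python illustration (my own, not from the thread): GROMACS picks the Ewald splitting parameter beta so that erfc(beta * rcoulomb) = ewald_rtol, and the reciprocal-space (PME grid) error is governed by beta * fourier_spacing, which stays constant when both lengths are scaled together.

```python
import math

def ewald_beta(rcoulomb, ewald_rtol=1e-5):
    """Find beta with erfc(beta * rcoulomb) = ewald_rtol by bisection
    (the condition GROMACS uses to choose the splitting parameter)."""
    lo, hi = 0.0, 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.erfc(mid * rcoulomb) > ewald_rtol:
            lo = mid  # beta too small: real-space tail too strong at the cut-off
        else:
            hi = mid
    return 0.5 * (lo + hi)

# The .mdp settings above vs. a 1.3x scaled pair with the same ratio
# (0.9 / 0.10 = 1.17 / 0.13 = 9)
for rc, spacing in [(0.9, 0.10), (1.17, 0.13)]:
    beta = ewald_beta(rc)
    # beta*spacing controls the PME grid error; identical for both pairs
    print(f"rcoulomb={rc:5.2f}  spacing={spacing:.2f}  "
          f"beta={beta:6.3f}  beta*spacing={beta * spacing:.4f}")
```

Both pairs print the same beta * spacing, so the reciprocal-space accuracy is unchanged while work shifts from the PME grid onto the cheaper (on GPU) real-space kernel.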


> However, about the performance of GPU versus CPU (mpi) let me
> try to explain it better:
>


> GPU
>
>                NODE (s)     Real (s)       (%)
>       Time:    6381.840    19210.349      33.2
>                        1h46:21
>                (Mnbf/s)     (MFlops)   (ns/day)   (hour/ns)
> Performance:      0.000        0.001     27.077       0.886
>
> MPI
>
>                NODE (s)     Real (s)       (%)
>       Time:   12621.257    12621.257     100.0
>                        3h30:21
>                (Mnbf/s)     (GFlops)   (ns/day)   (hour/ns)
> Performance:    388.633       28.773     13.691       1.753
>

Yes, sorry, I didn't realize that NODE time and Real time are different. Did
you run the GPU calculation on a desktop machine that was also doing other
things at the time? That might explain it. As far as I know, on a dedicated
machine not running any other programs, NODE and Real time should be the
same.
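The NODE-vs-Real distinction can be checked directly against the figures quoted in this thread. A quick back-of-the-envelope sketch (assuming the run covered 2 ns of simulated time, as the "total 2000 ps" comment in the .mdp suggests):

```python
# Figures quoted from the two md.log excerpts in this thread
SIM_NS = 2.0              # 2000 ps simulated (per the .mdp comment)
SECONDS_PER_DAY = 86400.0

gpu_node_s, gpu_real_s = 6381.840, 19210.349
mpi_real_s = 12621.257

def ns_per_day(sim_ns, seconds):
    """Throughput in ns/day given simulated ns and elapsed seconds."""
    return sim_ns * SECONDS_PER_DAY / seconds

print(f"GPU, from NODE time : {ns_per_day(SIM_NS, gpu_node_s):6.2f} ns/day")
print(f"GPU, from Real time : {ns_per_day(SIM_NS, gpu_real_s):6.2f} ns/day")
print(f"MPI, from Real time : {ns_per_day(SIM_NS, mpi_real_s):6.2f} ns/day")
print(f"GPU CPU utilization : {100 * gpu_node_s / gpu_real_s:.1f} %")
```

Computed from NODE time, the GPU run reproduces the reported 27.077 ns/day, and the NODE/Real ratio reproduces the 33.2 % column; computed from wall-clock (Real) time, the GPU run is actually slower than the 6-rank MPI run's 13.691 ns/day, which matches Renato's observation.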

> Looking above we can see that gromacs prints in the output that
> the simulation is faster when the GPU is used. But this is not the
> reality. The truth is that the simulation with MPI was 106 min faster
> than that with GPU. Does that seem correct to you? As I said before, I
> was expecting that the GPU should take less time than the 6-core MPI run.
>
Well, the exact time depends on a lot of factors, and you can probably speed
both up. But I would expect them to be about similarly fast.

Roland
-- 
gmx-users mailing list    gmx-users@gromacs.org
http://lists.gromacs.org/mailman/listinfo/gmx-users
Please search the archive at 
http://www.gromacs.org/Support/Mailing_Lists/Search before posting!
Please don't post (un)subscribe requests to the list. Use the 
www interface or send it to gmx-users-requ...@gromacs.org.
Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
