Re: [gmx-users] Cuda CC 2.0 restrictions

2014-09-09 Thread Szilárd Páll
Hi,

Is this rather large box a system that can actually be simulated at
a useful speed on a single Fermi GPU? Even with a 5 fs time-step you
won't get much more than 1-1.5 ns/day on a fast Fermi GPU like a GTX
580.

Given that you are quite a bit above the limit, reducing nstlist may
not shrink the pair list enough to fit the system on the GPU, unless
your current nstlist is quite large. What you can do instead is use
domain decomposition and start multiple ranks per GPU. In your case
two-way DD should be enough.
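
A quick back-of-the-envelope check of the two-way DD suggestion, sketched in Python. It assumes the super-clusters split roughly evenly across the ranks sharing the GPU and ignores halo regions and load imbalance, so the real per-rank count will be somewhat higher. The function name is made up for illustration; the two numbers come from the error message.

```python
import math

# Numbers taken from the mdrun error message.
MAX_GRID_X_CC2 = 65535     # grid x-dimension limit reported for the CC 2.0 card
N_SUPERCLUSTERS = 86276    # nonbonded work units mdrun wanted to launch


def min_ranks(n_work_units, limit):
    """Smallest number of equal shares that keeps each share within the limit.

    Assumption: work units divide evenly over ranks (no halo overhead).
    """
    return math.ceil(n_work_units / limit)


ranks = min_ranks(N_SUPERCLUSTERS, MAX_GRID_X_CC2)
per_rank = math.ceil(N_SUPERCLUSTERS / ranks)
print(f"ranks needed: {ranks}, super-clusters per rank: {per_rank}")
```

With the thread-MPI mdrun of that era, something like `mdrun -ntmpi 2 -gpu_id 00` should start two ranks sharing GPU 0, but check the mdrun help of your version for the exact options.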

Cheers,
--
Szilárd


On Fri, Sep 5, 2014 at 11:24 PM, Mirco Wahab
mirco.wa...@chemie.tu-freiberg.de wrote:
> I've run into a problem with an older card (GTX-580)
> which is CC 2.0. On a larger box size, mdrun stops
> with:
>
>    Fatal error:
>    Watch out, the input system is too large to simulate!
>    The number of nonbonded work units (=number of super-clusters)
>    exceeds the maximum grid size in x dimension (86276 > 65535)!
>
>
> This seems to refer to the CUDA limit on thread blocks per grid dimension,
> a limitation specific to CC below 3.0 (I know this system already worked
> on a CC 3.0 device).
>
> My question: can this grid count calculated by mdrun be manipulated
> somehow by mdp options (nstlist, rvdw, rlist)?
>
> Thanks,
>
> M.
> --
> Gromacs Users mailing list
>
> * Please search the archive at
> http://www.gromacs.org/Support/Mailing_Lists/GMX-Users_List before posting!
>
> * Can't post? Read http://www.gromacs.org/Support/Mailing_Lists
>
> * For (un)subscribe requests visit
> https://maillist.sys.kth.se/mailman/listinfo/gromacs.org_gmx-users or send a
> mail to gmx-users-requ...@gromacs.org.


Re: [gmx-users] Cuda CC 2.0 restrictions

2014-09-06 Thread Mark Abraham
On Fri, Sep 5, 2014 at 11:24 PM, Mirco Wahab 
mirco.wa...@chemie.tu-freiberg.de wrote:

> I've run into a problem with an older card (GTX-580)
> which is CC 2.0. On a larger box size, mdrun stops
> with:
>
>    Fatal error:
>    Watch out, the input system is too large to simulate!
>    The number of nonbonded work units (=number of super-clusters)
>    exceeds the maximum grid size in x dimension (86276 > 65535)!
>
>
> This seems to refer to the CUDA limit on thread blocks per grid dimension,
> a limitation specific to CC below 3.0 (I know this system already worked
> on a CC 3.0 device).
>
> My question: can this grid count calculated by mdrun be manipulated
> somehow by mdp options (nstlist, rvdw, rlist)?


The number of super-clusters grows with the size of the neighbour list, i.e.
something like rlist^3. By default, rlist is set by a complex
diffusion-based heuristic using T, max(rvdw,rcoulomb),
verlet-buffer-tolerance and nstlist. The minimum value for nstlist is set
in the .mdp file, but mdrun may increase it based on the hardware, using
other heuristics. A smaller nstlist requires a smaller buffer and thus a
smaller rlist. You can force this with mdrun -nstlist x, which makes mdrun
use x for nstlist rather than the .mdp value or its own choice. That's
going to perform worse than the default, because the CPU-only
neighbour searching will happen correspondingly more often, but it's better
than not running at all :-D.
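
To get a feel for how far rlist would have to come down, here is a small Python sketch that takes the rlist^3 estimate above at face value. That scaling is only a rough approximation, and the function name is invented for illustration; the two counts are the ones in the error message.

```python
def required_rlist_scale(n_current, n_limit):
    """Factor by which rlist must shrink so that the super-cluster count,
    assumed to scale as rlist**3, drops from n_current to n_limit."""
    return (n_limit / n_current) ** (1.0 / 3.0)


# 86276 super-clusters reported, 65535 allowed on the CC 2.0 device.
scale = required_rlist_scale(86276, 65535)
print(f"rlist would need to shrink to about {scale:.3f} of its current value")
```

Under that assumption rlist has to drop by a bit under 10 percent. Whether a smaller nstlist alone achieves that depends on how much of the current rlist is buffer rather than cut-off.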

You could also mess around with reducing max(rcoulomb,rvdw) and making
corresponding adjustments to the PME parameters, but demonstrating
correctness and achieving CPU-GPU load balance is less straightforward
than with the above.

Mark



[gmx-users] Cuda CC 2.0 restrictions

2014-09-05 Thread Mirco Wahab

I've run into a problem with an older card (GTX-580)
which is CC 2.0. On a larger box size, mdrun stops
with:

   Fatal error:
   Watch out, the input system is too large to simulate!
   The number of nonbonded work units (=number of super-clusters)
   exceeds the maximum grid size in x dimension (86276 > 65535)!


This seems to refer to the CUDA limit on thread blocks per grid dimension,
a limitation specific to CC below 3.0 (I know this system already worked
on a CC 3.0 device).
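
For reference, the check behind that error can be sketched in Python as follows. The limits are the documented CUDA maximum grid x-dimensions (65535 for compute capability below 3.0, 2^31 - 1 from 3.0 on). The function names are made up for illustration and this is not mdrun's actual code.

```python
def max_grid_x(cc_major):
    """Maximum x-dimension of a CUDA grid of thread blocks for a given
    compute capability major version."""
    return 65535 if cc_major < 3 else 2**31 - 1


def fits_on_device(n_work_units, cc_major):
    """True if one thread block per work unit fits in the grid x-dimension."""
    return n_work_units <= max_grid_x(cc_major)


# 86276 super-clusters: too many for CC 2.0, fine on CC 3.0.
print(fits_on_device(86276, 2), fits_on_device(86276, 3))
```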

My question: can this grid count calculated by mdrun be manipulated
somehow by mdp options (nstlist, rvdw, rlist)?

Thanks,

M.