This diagnosis is total crap (I think), as I tried to explain. We would
never get the same result (or the right result), and partitioning makes no
sense. Something else is going on. Can't we run on a 2-GPU system at ANL?
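If someone can grab a 2-GPU node, reproducing Dave's run with the device
options Barry suggests below would look roughly like this (the cusp
matrix/vector types and the -m/-n sizes are my guesses at Dave's setup, so
adjust to taste):

    ./ex2f -m 800 -n 800 -mat_type aijcusp -vec_type cusp -cuda_show_devices
    ./ex2f -m 800 -n 800 -mat_type aijcusp -vec_type cusp -cuda_set_device 0 -log_summary

Comparing the -log_summary output of the default run against the run pinned
to one device should tell us whether the slowdown has anything to do with
the device count at all.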
Matt

On Sat, Oct 1, 2011 at 9:30 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> On Oct 1, 2011, at 9:22 PM, Dave Nystrom wrote:
>
> > Hi Barry,
> >
> > I've sent a couple more emails on this topic. What I am trying to do at
> > the moment is to figure out how to have a problem run on only one gpu
> > if it will fit in the memory of that gpu. Back in April, when I had
> > built petsc-dev with Cuda 3.2, petsc would only use one gpu if you had
> > multiple gpus on your machine. In order to use multiple gpus for a
> > problem, one had to use multiple threads, with a separate thread
> > assigned to control each gpu. But Cuda 4.0 has, I believe, made that
> > transparent and under the hood. So now when I run a small example
> > problem such as src/ksp/ksp/examples/tutorials/ex2f.F with an 800x800
> > problem, it gets partitioned to run on both of the gpus in my machine.
> > The result is a very large performance hit because of communication
> > back and forth from one gpu to the other via the cpu.
>
> How do you know there is lots of communication from the GPU to the CPU?
> In the -log_summary? Nope, because PETSc does not manage anything like
> that (that is, one CPU process using both GPUs).
>
> > So this problem with a 3200x3200 grid runs 5x slower now than it did
> > with Cuda 3.2. I believe that if one is programming down at the cuda
> > level, it is possible to have a smaller problem run on only one gpu, so
> > that there is communication only between the cpu and gpu, and only at
> > the start and end of the calculation.
> >
> > To me, it seems like what is needed is a petsc option to specify the
> > number of gpus to run on that can somehow get passed down to the cuda
> > level through cusp and thrust. I fear that the short-term solution is
> > going to have to be for me to pull one of the gpus out of my desktop
> > system, but it would be nice if there was a way to tell petsc and
> > friends to just use one gpu when I want it to.
> >
> > If necessary, I can send a couple of log files to demonstrate what I am
> > trying to describe regarding the performance hit.
>
> I am not convinced that the poor performance you are getting now has
> anything to do with using both GPUs. Please run a PETSc program with the
> command-line option -cuda_show_devices
>
> What are the choices? You can then pick one of them and run with
> -cuda_set_device integer
>
> Does this change things?
>
> Barry
>
> > Thanks,
> >
> > Dave
> >
> > Barry Smith writes:
> >> Dave,
> >>
> >> We have no mechanism in the PETSc code for a single PETSc CPU process
> >> to use two GPUs at the same time. However, you could have two MPI
> >> processes, each using its own GPU.
> >>
> >> The one tricky part is that you need to make sure each MPI process
> >> uses a different GPU. We currently do not have a mechanism to do this
> >> assignment automatically. I think it can be done with cudaSetDevice(),
> >> but I don't know the details; sending this to petsc-dev at mcs.anl.gov,
> >> where more people may know.
> >>
> >> PETSc folks,
> >>
> >> We need a way to have this set up automatically.
> >>
> >> Barry
> >>
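Per-rank device selection really is just a few lines. A rough, untested
sketch of what Barry describes (the rank-modulo mapping is only an
illustration, and cudaSetDevice() has to run before anything creates a
CUDA context, i.e. before PetscInitialize()):

    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
      int rank, ndev;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      /* how many GPUs does this process see? */
      cudaGetDeviceCount(&ndev);

      /* illustrative mapping: spread ranks across the visible GPUs,
         so on a 2-GPU node rank 0 gets device 0 and rank 1 gets device 1 */
      cudaSetDevice(rank % ndev);

      printf("rank %d bound to device %d of %d\n", rank, rank % ndev, ndev);

      /* PetscInitialize() and the rest of the application would go here */

      MPI_Finalize();
      return 0;
    }

The open question is where PETSc itself should do this, since it cannot
know in general how ranks are laid out across nodes and GPUs.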
> >> On Oct 1, 2011, at 5:43 PM, Dave Nystrom wrote:
> >>
> >>> I'm running petsc on a machine with Cuda 4.0 and 2 gpus. This is a
> >>> desktop machine with a single processor. I know that Cuda 4.0 has
> >>> support for running on multiple gpus, but I don't know if petsc uses
> >>> that. But suppose I have a problem that will fit in the memory of a
> >>> single gpu. Will petsc run the problem on a single gpu, or does it
> >>> split it between the 2 gpus and incur the communication overhead of
> >>> copying data between the two gpus?
> >>>
> >>> Thanks,
> >>>
> >>> Dave
> >>>
> >>
>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener