Dang, I had two copies of thrust and cusp and was using the "wrong" one hence
everything was working for me.
Ok, I'll try to get txpetscgpu updated
Barry
On Oct 3, 2011, at 10:18 PM, Satish Balay wrote:
> /home/wdn/Projects/Petsc/src/branches/master/petsc-dev/LINUX_GNU_OPTIMI
So, for the moment, I have backed off using --download-txpetscgpu and can now
get a successful build that seems to work and recover my April performance
results on the gpu. I am very eager to try the --download-txpetscgpu option
though.
Thanks,
Dave
Dave Nystrom writes:
> Hi Barry,
>
> Just
>>>
/home/wdn/Projects/Petsc/src/branches/master/petsc-dev/LINUX_GNU_OPTIMIZE_SERIAL_CUDA_40_LITE/include/txpetscgpu/include/csr_spmv_part_vector_gpu.h:23:44:
error: thrust/detail/device/cuda/arch.h: No such file or directory
/usr/bin/ar: aijcusp.o: No such file or directory
<<<
This file is at .
Dave,
I have found the cause of the problem you were seeing and have fixed it. It
was caused by bad code when --download-txpetscgpu was used.
To eliminate the problem
1) upgrade to latest cusp and thrust via mercurial
2) rm -rf externpackages/txpetscgpu*
3) hg pull; hg update
4
On Sun, Oct 2, 2011 at 4:43 PM, Dave Nystrom
wrote:
> Dave Nystrom writes:
> > In case it might be useful, I have attached two log files of runs with
> the
> > ex2f petsc example from src/ksp/ksp/examples/tutorials. One was run
> back in
> > April with petsc-dev linked to Cuda 3.2. It shows e
On Sun, Oct 2, 2011 at 10:50 PM, Dave Nystrom wrote:
> Hi Barry,
>
> Barry Smith writes:
> > Dave,
> >
> > I cannot explain why it does not use the MatMult_SeqAIJCusp() - it does
> for me.
>
> Do you get good performance running a problem like ex2?
>
Okay, now the problem is clear. This does
Matthew Knepley writes:
> On Sun, Oct 2, 2011 at 10:50 PM, Dave Nystrom wrote:
>
> > Hi Barry,
> >
> > Barry Smith writes:
> > > Dave,
> > >
> > > I cannot explain why it does not use the MatMult_SeqAIJCusp() - it does
> > for me.
> >
> > Do you get good performan
Hi Barry,
Barry Smith writes:
> Dave,
>
> I cannot explain why it does not use the MatMult_SeqAIJCusp() - it does for
> me.
Do you get good performance running a problem like ex2?
> Have you updated to the latest cusp/thrust? From the mercurial repositories?
I did try the latest version o
Dave,
I cannot explain why it does not use the MatMult_SeqAIJCusp() - it does for me.
Have you updated to the latest cusp/thrust? From the mercurial repositories?
There is a difference, in your new 4.0 build you added
--download-txpetscgpu=yes BTW: that doesn't work for me with the la
On Oct 2, 2011, at 6:39 PM, Dave Nystrom wrote:
> Thanks for the update. I don't believe I have gotten a run with good
> performance yet, either from C or Fortran. I wish there was an easy way for
> me to force use of only one of my gpus. I don't want to have to pull one of
> the gpus in order
Barry Smith writes:
> On Oct 2, 2011, at 6:39 PM, Dave Nystrom wrote:
>
>> Thanks for the update. I don't believe I have gotten a run with good
>> performance yet, either from C or Fortran. I wish there was an easy way for
>> me to force use of only one of my gpus. I don't want to have to
It is not doing the MatMult operation on the GPU and hence needs to move the
vectors back and forth for each operation (since MatMult is done on the CPU
with the vector while vector operations are done on the GPU) hence the terrible
performance.
Not sure why yet. It is copying the Mat do
Thanks for the update. I don't believe I have gotten a run with good
performance yet, either from C or Fortran. I wish there was an easy way for
me to force use of only one of my gpus. I don't want to have to pull one of
the gpus in order to see if that is complicating things with Cuda 4.0. I'l
Dave Nystrom writes:
> In case it might be useful, I have attached two log files of runs with the
> ex2f petsc example from src/ksp/ksp/examples/tutorials. One was run back in
> April with petsc-dev linked to Cuda 3.2. It shows excellent runtime
> performance. The other was run today with pe
In case it might be useful, I have attached two log files of runs with the
ex2f petsc example from src/ksp/ksp/examples/tutorials. One was run back in
April with petsc-dev linked to Cuda 3.2. It shows excellent runtime
performance. The other was run today with petsc-dev checked out of the
mercur
Matthew Knepley writes:
> On Sat, Oct 1, 2011 at 11:26 PM, Dave Nystrom wrote:
> > Barry Smith writes:
> > > On Oct 1, 2011, at 9:22 PM, Dave Nystrom wrote:
> > > > Hi Barry,
> > > >
> > > > I've sent a couple more emails on this topic. What I am trying to do
> > at
On Sat, Oct 1, 2011 at 11:26 PM, Dave Nystrom wrote:
> Barry Smith writes:
> >
> > On Oct 1, 2011, at 9:22 PM, Dave Nystrom wrote:
> >
> > > Hi Barry,
> > >
> > > I've sent a couple more emails on this topic. What I am trying to do
> at the
> > > moment is to figure out how to have a prob
Our test box has 2 GPUs:
balay@bb30:~> lspci | grep -i nvidia
0b:00.0 3D controller: nVidia Corporation GT200 [Tesla C1060] (rev a1)
0c:00.0 3D controller: nVidia Corporation GT200 [Tesla C1060] (rev a1)
balay@bb30:~>
Is there some test I can run on this? [it has cuda 4.0]
satish
On Sat, 1 Oc
This diagnosis is total crap (I think), as I tried to explain. We would
never get the same result (or the right result),
and partitioning makes no sense. Something else is going on. Can't we run on
a 2 GPU system at ANL?
Matt
On Sat, Oct 1, 2011 at 9:30 PM, Barry Smith wrote:
>
> On Oct 1, 2
On Oct 1, 2011, at 9:22 PM, Dave Nystrom wrote:
> Hi Barry,
>
> I've sent a couple more emails on this topic. What I am trying to do at the
> moment is to figure out how to have a problem run on only one gpu if it will
> fit in the memory of that gpu. Back in April when I had built petsc-dev w
Dave,
We have no mechanism in the PETSc code for a PETSc single CPU process to
use two GPUs at the same time. However you could have two MPI processes each
using their own GPU.
The one tricky part is you need to make sure each MPI process uses a
different GPU. We currently do not
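
The point above about giving each MPI process its own GPU can be sketched as follows. This is only an illustration, not a PETSc mechanism: the helper names gpu_for_rank and env_for_rank are hypothetical, and the sketch assumes the CUDA runtime in use honours the CUDA_VISIBLE_DEVICES environment variable and that the launcher can tell each process its node-local rank.

```python
import os


def gpu_for_rank(local_rank, num_gpus):
    """Hypothetical helper: round-robin map from node-local MPI rank to GPU index."""
    if num_gpus < 1:
        raise ValueError("need at least one GPU")
    return local_rank % num_gpus


def env_for_rank(local_rank, num_gpus, base_env=None):
    """Hypothetical helper: copy an environment and expose exactly one GPU in it.

    Assumption: the CUDA runtime reads CUDA_VISIBLE_DEVICES when it initialises,
    so the variable has to be set before the process first touches the GPU.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_for_rank(local_rank, num_gpus))
    return env


if __name__ == "__main__":
    # Two processes on a two-GPU node: each one is shown a different device.
    for rank in range(2):
        print(rank, env_for_rank(rank, 2, base_env={})["CUDA_VISIBLE_DEVICES"])
```

The same idea would cover the single-GPU case raised earlier in the thread: a single process started with an environment from env_for_rank(0, 1) would see only device 0.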