Barry,

The CPU timing I reported was after recompiling the code (I removed PETSC_USE_DEBUG and GDB macros from petscconf.h).
Thanks,

================================
Keita Teranishi
Scientific Library Group
Cray, Inc.
keita at cray.com
================================

-----Original Message-----
From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-boun...@mcs.anl.gov] On Behalf Of Barry Smith
Sent: Friday, August 27, 2010 3:37 PM
To: For users of the development version of PETSc
Subject: Re: [petsc-dev] [GPU] Performance on Fermi

##########################################################
#                                                        #
#                       WARNING!!!                       #
#                                                        #
#   This code was compiled with a debugging option,      #
#   To get timing results run ./configure                #
#   using --with-debugging=no, the performance will      #
#   be generally two or three times faster.              #
#                                                        #
##########################################################

   You need to build the code with ./configure --with-debugging=0 to make a fair comparison. This will speed up the CPU version.

   Barry

On Aug 27, 2010, at 2:22 PM, Keita Teranishi wrote:

> Barry,
>
> CPU version takes another digit. So it is 1.6 sec on Fermi and 17 sec on 1 core CPU.
>
> Thanks,
> ================================
> Keita Teranishi
> Scientific Library Group
> Cray, Inc.
> keita at cray.com
> ================================
>
>
> -----Original Message-----
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Keita Teranishi
> Sent: Friday, August 27, 2010 2:20 PM
> To: For users of the development version of PETSc
> Subject: Re: [petsc-dev] [GPU] Performance on Fermi
>
> Barry,
>
> Yes. It improves the performance dramatically, but the execution time for KSPSolve stays the same.
>
> MatMult 5.2 Gflops
>
> Thanks,
>
> ================================
> Keita Teranishi
> Scientific Library Group
> Cray, Inc.
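[Editor's note: a sketch of the debug-free rebuild Barry recommends. The PETSC_ARCH name and the --with-cuda option are illustrative assumptions; only --with-debugging=0 is quoted from the thread. Adapt to your site's configure options.]

```shell
# Reconfigure PETSc without debugging so CPU timings are representative.
# arch-opt is a hypothetical PETSC_ARCH name; --with-cuda=1 is assumed
# for a GPU-enabled build of this era of petsc-dev.
cd $PETSC_DIR
./configure PETSC_ARCH=arch-opt --with-debugging=0 --with-cuda=1
make PETSC_ARCH=arch-opt all
```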
> keita at cray.com
> ================================
>
>
> -----Original Message-----
> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Barry Smith
> Sent: Friday, August 27, 2010 2:15 PM
> To: For users of the development version of PETSc
> Subject: [petsc-dev] [GPU] Performance on Fermi
>
>
>    PETSc-dev folks,
>
>     Please prepend all messages to petsc-dev that involve GPUs with [GPU] so they can be easily filtered.
>
>    Keita,
>
>     To run src/ksp/ksp/examples/tutorials/ex2.c with CUDA you need the flag -vec_type cuda
>
>     Note also that this example is fine for simple ONE processor tests but should not be used for parallel testing because it does not do a proper parallel partitioning for performance
>
>    Barry
>
> On Aug 27, 2010, at 2:04 PM, Keita Teranishi wrote:
>
>> Hi,
>>
>> I ran ex2.c with a matrix from a 512x512 grid.
>> I set CG and Jacobi for the solver and preconditioner.
>> GCC-4.4.4 and CUDA-3.1 are used to compile the code.
>> BLAS and LAPACK are not optimized.
>>
>> MatMult:
>>   Fermi: 1142 MFlops
>>   1 core Istanbul: 420 MFlops
>>
>> KSPSolve:
>>   Fermi: 1.5 Sec
>>   1 core Istanbul: 1.7 Sec
>>
>>
>> ================================
>> Keita Teranishi
>> Scientific Library Group
>> Cray, Inc.
>> keita at cray.com
>> ================================
>>
>>
>> -----Original Message-----
>> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Satish Balay
>> Sent: Friday, August 27, 2010 1:49 PM
>> To: For users of the development version of PETSc
>> Subject: Re: [petsc-dev] Problem with petsc-dev
>>
>> On Fri, 27 Aug 2010, Satish Balay wrote:
>>
>>> There was a problem with tarball creation for the past few days. Will
>>> try to respin manually today - and update you.
>>
>> the petsc-dev tarball is now updated on the website..
>>
>> Satish
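[Editor's note: putting the thread's pieces together, a single-process run of ex2 matching Keita's setup (CG + Jacobi on a 512x512 grid, with Barry's -vec_type cuda flag) might look like the sketch below. The -m/-n, -ksp_type, -pc_type, and -log_summary options are standard PETSc/ex2 options of that era, not quoted from the thread.]

```shell
# Run the tutorial example on one process: 512x512 grid, CG + Jacobi,
# vectors on the GPU, with performance logging to compare MatMult/KSPSolve.
cd $PETSC_DIR/src/ksp/ksp/examples/tutorials
./ex2 -m 512 -n 512 -ksp_type cg -pc_type jacobi \
      -vec_type cuda -log_summary
```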