sure. will do.
On Wed, Mar 28, 2012 at 1:23 PM, Matthew Knepley <knepley at gmail.com> wrote: > On Wed, Mar 28, 2012 at 1:14 PM, David Fuentes <fuentesdt at gmail.com>wrote: > >> thanks! its running, but I seem to be getting different answer for >> cpu/gpu ? >> i had some floating point problems on this Tesla M2070 gpu before, but >> adding the '-arch=sm_20' option seemed to fix it last time. >> >> >> is the assembly in single precision ? my 'const PetscReal >> jacobianInverse' being passed in are doubles >> > > Yep, that is the problem. I have not tested anything in double. I have not > decided exactly how to handle it. Can you > make another ARCH --with-precision=single and make sure it works, and then > we can fix the double issue? > > Thanks, > > Matt > > >> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch -gpu >> GPU layout grid(1,2,1) block(3,1,1) with 1 batches >> N_t: 3, N_cb: 1 >> Residual: >> Vector Object: 1 MPI processes >> type: seq >> 0 >> 755712 >> 0 >> -58720 >> -2953.13 >> 0.375 >> 1.50323e+07 >> 0.875 >> 0 >> SCRGP2$ >> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch >> Residual: >> Vector Object: 1 MPI processes >> type: seq >> -0.25 >> -0.5 >> 0.25 >> -0.5 >> -1 >> 0.5 >> 0.25 >> 0.5 >> 0.75 >> >> >> >> >> >> On Wed, Mar 28, 2012 at 11:55 AM, Matthew Knepley <knepley at >> gmail.com>wrote: >> >>> On Wed, Mar 28, 2012 at 11:45 AM, David Fuentes <fuentesdt at >>> gmail.com>wrote: >>> >>>> The example seems to be running on cpu with '-batch' but i'm getting >>>> errors in line 323 with the '-gpu' option >>>> >>>> [0]PETSC ERROR: IntegrateElementBatchGPU() line 323 in >>>> src/snes/examples/tutorials/ex52_integrateElement.cu >>>> >>>> should this possibly be PetscScalar ? >>>> >>> >>> No. >>> >>> >>>> - ierr = cudaMalloc((void**) &d_coefficients, Ne*N_bt * >>>> sizeof(float));CHKERRQ(ierr); >>>> + ierr = cudaMalloc((void**) &d_coefficients, Ne*N_bt * >>>> sizeof(PetscScalar));CHKERRQ(ierr); >>>> >>>> >>>> SCRGP2$ python >>>> $PETSC_DIR/bin/pythonscripts/PetscGenerateFEMQuadrature.py 2 1 1 1 >>>> laplacian ex52.h >>>> ['/opt/apps/PETSC/petsc-dev/bin/pythonscripts/PetscGenerateFEMQuadrature.py', >>>> '2', '1', '1', '1', 'laplacian', 'ex52.h'] >>>> 2 1 1 1 laplacian >>>> [{(-1.0, -1.0): [(1.0, ())]}, {(1.0, -1.0): [(1.0, ())]}, {(-1.0, 1.0): >>>> [(1.0, ())]}] >>>> {0: {0: [0], 1: [1], 2: [2]}, 1: {0: [], 1: [], 2: []}, 2: {0: []}} >>>> Perm: [0, 1, 2] >>>> Creating /home/fuentes/snestutorial/ex52.h >>>> Creating /home/fuentes/snestutorial/ex52_gpu.h >>>> [{(-1.0, -1.0): [(1.0, ())]}, {(1.0, -1.0): [(1.0, ())]}, {(-1.0, 1.0): >>>> [(1.0, ())]}] >>>> {0: {0: [0], 1: [1], 2: [2]}, 1: {0: [], 1: [], 2: []}, 2: {0: []}} >>>> Perm: [0, 1, 2] >>>> Creating /home/fuentes/snestutorial/ex52_gpu_inline.h >>>> >>>> SCRGP2$ make ex52 >>>> /usr/bin/mpicxx -o ex52.o -c -O0 -g -fPIC >>>> -I/opt/apps/PETSC/petsc-dev/include >>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/include >>>> -I/opt/apps/cuda/4.1/cuda/include -I/opt/apps/PETSC/petsc-dev/include/sieve >>>> -I/opt/MATLAB/R2011a/extern/include -I/usr/include >>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/cbind/include >>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/forbind/include >>>> -I/usr/include/mpich2 -D__INSDIR__=src/snes/examples/tutorials/ ex52.c >>>> nvcc -O0 -g -arch=sm_20 -c --compiler-options="-O0 -g -fPIC >>>> -I/opt/apps/PETSC/petsc-dev/include >>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/include >>>> -I/opt/apps/cuda/4.1/cuda/include -I/opt/apps/PETSC/petsc-dev/include/sieve >>>> -I/opt/MATLAB/R2011a/extern/include -I/usr/include >>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/cbind/include >>>> -I/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/forbind/include >>>> -I/usr/include/mpich2 -D__INSDIR__=src/snes/examples/tutorials/" >>>> ex52_integrateElement.cu >>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(13): warning: variable "weights_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(28): warning: variable "BasisDerivatives_0" was >>>> declared but never referenced >>>> >>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(7): warning: variable "points_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(13): warning: variable "weights_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(21): warning: variable "Basis_0" was declared but >>>> never referenced >>>> >>>> ex52_gpu_inline.h(28): warning: variable "BasisDerivatives_0" was >>>> declared but never referenced >>>> >>>> /usr/bin/mpicxx -O0 -g -o ex52 ex52.o ex52_integrateElement.o >>>> >>>> -Wl,-rpath,/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/lib >>>> -L/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/lib -lpetsc >>>> -Wl,-rpath,/opt/apps/PETSC/petsc-dev/gcc-4.4.3-mpich2-1.2-epd-sm_20-dbg/lib >>>> -ltriangle -lX11 -lpthread -lsuperlu_dist_3.0 -lcmumps -ldmumps -lsmumps >>>> -lzmumps -lmumps_common -lpord -lparmetis -lmetis -lscalapack -lblacs >>>> -Wl,-rpath,/opt/apps/cuda/4.1/cuda/lib64 -L/opt/apps/cuda/4.1/cuda/lib64 >>>> -lcufft -lcublas -lcudart -lcusparse >>>> -Wl,-rpath,/opt/MATLAB/R2011a/sys/os/glnxa64:/opt/MATLAB/R2011a/bin/glnxa64:/opt/MATLAB/R2011a/extern/lib/glnxa64 >>>> -L/opt/MATLAB/R2011a/bin/glnxa64 -L/opt/MATLAB/R2011a/extern/lib/glnxa64 >>>> -leng -lmex -lmx -lmat -lut -licudata -licui18n -licuuc >>>> -Wl,-rpath,/opt/epd-7.1-2-rh5-x86_64/lib -L/opt/epd-7.1-2-rh5-x86_64/lib >>>> -lmkl_rt -lmkl_intel_thread -lmkl_core -liomp5 -lexoIIv2for -lexodus >>>> -lnetcdf_c++ -lnetcdf -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.4.3 >>>> -L/usr/lib/gcc/x86_64-linux-gnu/4.4.3 -ldl -lmpich -lopa -lpthread -lrt >>>> -lgcc_s -lmpichf90 -lgfortran -lm -lm -lmpichcxx -lstdc++ -lmpichcxx >>>> -lstdc++ -ldl -lmpich -lopa -lpthread -lrt -lgcc_s -ldl >>>> /bin/rm -f ex52.o ex52_integrateElement.o >>>> >>>> >>>> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch >>>> Residual: >>>> Vector Object: 1 MPI processes >>>> type: seq >>>> -0.25 >>>> -0.5 >>>> 0.25 >>>> -0.5 >>>> -1 >>>> 0.5 >>>> 0.25 >>>> 0.5 >>>> 0.75 >>>> SCRGP2$ ./ex52 -dim 2 -compute_function -show_residual -batch -gpu >>>> [0]PETSC ERROR: IntegrateElementBatchGPU() line 323 in >>>> src/snes/examples/tutorials/ex52_integrateElement.cu >>>> [0]PETSC ERROR: FormFunctionLocalBatch() line 679 in >>>> src/snes/examples/tutorials/ex52.c >>>> [0]PETSC ERROR: SNESDMComplexComputeFunction() line 431 in >>>> src/snes/utils/damgsnes.c >>>> [0]PETSC ERROR: main() line 1021 in src/snes/examples/tutorials/ex52.c >>>> application called MPI_Abort(MPI_COMM_WORLD, 35) - process 0 >>>> >>> >>> This is failing on cudaMalloc(), which means your card is not available >>> for running. Are you trying to run on your laptop? >>> If so, applications like Preview can lock up the GPU. I know of no way >>> to test this in CUDA while running. I just close >>> apps until it runs. >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> >>>> On Tue, Mar 27, 2012 at 8:37 PM, Matthew Knepley <knepley at >>>> gmail.com>wrote: >>>> >>>>> On Tue, Mar 27, 2012 at 2:10 PM, Blaise Bourdin <bourdin at lsu.edu>wrote: >>>>> >>>>>> >>>>>> On Mar 27, 2012, at 1:23 PM, Matthew Knepley wrote: >>>>>> >>>>>> On Tue, Mar 27, 2012 at 12:58 PM, David Fuentes <fuentesdt at >>>>>> gmail.com>wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I had a question about the status of example 52. >>>>>>> >>>>>>> >>>>>>> http://petsc.cs.iit.edu/petsc/petsc-dev/file/a8e2f2c19319/src/snes/examples/tutorials/ex52.c >>>>>>> >>>>>>> http://petsc.cs.iit.edu/petsc/petsc-dev/file/a8e2f2c19319/src/snes/examples/tutorials/ex52_integrateElement.cu >>>>>>> >>>>>>> >>>>>>> Can this example be used with a DM object created from an >>>>>>> unstructured exodusII mesh, DMMeshCreateExodus, And the FEM assembly >>>>>>> done >>>>>>> on GPU ? >>>>>>> >>>>>> >>>>>> 1) I have pushed many more tests for it now. They can be run using >>>>>> the Python build system >>>>>> >>>>>> ./config/builder2.py check src/snes/examples/tutorials/ex52.c >>>>>> >>>>>> in fact, you can build any set of files this way. >>>>>> >>>>>> 2) The Exodus creation has to be converted to DMComplex from DMMesh. >>>>>> That should not take me very long. Blaise maintains that >>>>>> so maybe there will be help :) You will just replace >>>>>> DMComplexCreateBoxMesh() with DMComplexCreateExodus(). If you request >>>>>> it, I will bump it up the list. >>>>>> >>>>>> >>>>>> DMMeshCreateExodusNG is much more flexible than DMMeshCreateExodus in >>>>>> that it can read meshes with multiple element types and should have a >>>>>> much >>>>>> lower memory footprint. The code should be fairly easy to read. you can >>>>>> email me directly if you have specific questions. I had looked at >>>>>> creating >>>>>> a DMComplex and it did not look too difficult, as long as interpolation >>>>>> is >>>>>> not needed. I have plans to write DMComplexCreateExodus, but haven't had >>>>>> time too so far. Updating the Vec viewers and readers may be a bit more >>>>>> involved. In perfect world, one would write an EXODUS viewer following >>>>>> the >>>>>> lines of the VTK and HDF5 ones. >>>>>> >>>>> >>>>> David and Blaise, I have converted this function, now >>>>> DMComplexCreateExodus(). Its not tested, but I think >>>>> Blaise has some stuff we can use to test it. >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> Blaise >>>>>> >>>>>> >>>>>> >>>>>> Let me know if you can run the tests. >>>>>> >>>>>> Thanks >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> Thanks, >>>>>>> David >>>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which >>>>>> their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>>> >>>>>> -- >>>>>> Department of Mathematics and Center for Computation & Technology >>>>>> Louisiana State University, Baton Rouge, LA 70803, USA >>>>>> Tel. +1 (225) 578 1612, Fax +1 (225) 578 4276 >>>>>> http://www.math.lsu.edu/~bourdin >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120328/4732f615/attachment.htm>