> Let's ignore the 'Sun Grid Engine environment' initially and just figure out
> your PETSc install.
>
> - What MPI is it built with? Send us the output from the compile of ex19.
>
> - You claim 'make test' worked fine - i.e. this example ran fine in
>   parallel. Can you confirm this with a manual run?
>
> [If that's the case - then PETSc would be working correctly with the
> MPI specified.]
>
> From the info below -- the example crashes happen only in the 'Sun Grid
> Engine environment'. What is that? And why should binaries compiled
> with this default 'MPI' work in that grid environment - without
> recompiling against a different 'sun-grid-mpi'?
>
> Satish
Someone else using the PISM software (which sits on top of PETSc here), over
in Alaska as it happens, has seen similar errors, so I am thinking it may not
just be my environment, which I doubt matches theirs. For your consideration
though:

===========

My PETSc was built against Open MPI 1.4.

===========

Compilation of the example in question shows:

$ export PETSC_DIR=/vol/grid/pkg/petsc-3.0.0-p7
$ gmake ex19
mpicc -o ex19.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -I/vol/grid/pkg/petsc-3.0.0-p7/include -I/vol/grid/pkg/petsc-3.0.0-p7/include -I/usr/pkg/include -D__SDIR__="src/snes/examples/tutorials/" ex19.c
mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex19 ex19.o -Wl,-rpath,/vol/grid/pkg/petsc-3.0.0-p7/lib -L/vol/grid/pkg/petsc-3.0.0-p7/lib -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/usr/pkg/lib -lX11 -llapack -lblas -L/usr/pkg/lib -lmpi -lopen-rte -lopen-pal -lutil -lpthread -lgcc_eh -Wl,-rpath,/usr/pkg/lib -lmpi_f77 -lf95 -lm -lm -L/usr/pkg/lib/gcc-lib/i386--netbsdelf/4.0.3 -L/lib -lm -lm -lmpi_cxx -lstdc++ -lgcc_s -lmpi_cxx -lstdc++ -lgcc_s -lmpi -lopen-rte -lopen-pal -lutil -lpthread -lgcc_eh
/bin/rm -f ex19.o

$ ./ex19
lid velocity = 0.0204082, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2
lid velocity = 0.0204082, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2

==========

Running a parallel invocation local to one machine:

$ mpirun -n 2 ./ex19 -dmmg_nlevels 4
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2

$ mpirun -n 4 ./ex19 -dmmg_nlevels 4
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2

However, when submitting within the SGE environment, we see a similar story
to that seen with the PISM package:

% qsub -pe kmbmpi 2 my_mpirun_job.sh
% cat my_mpirun_job.sh.o425710
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2
lid velocity = 0.0016, prandtl # = 1, grashof # = 1
Number of Newton iterations = 2

% qsub -pe kmbmpi 4 my_mpirun_job.sh

A swathe of PETSc errors.

==========

--
Kevin M. Buckley                          Room:  CO327
School of Engineering and                 Phone: +64 4 463 5971
 Computer Science
Victoria University of Wellington
New Zealand
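The contents of my_mpirun_job.sh are not shown above, so the sketch below is
only an assumption about its general shape: a minimal SGE job script wrapping
the same mpirun invocation used interactively, with the kmbmpi parallel
environment from the qsub lines. The directives, paths, and the diagnostic
lines are illustrative, not the actual script.

    #!/bin/sh
    #$ -S /bin/sh
    #$ -cwd                     # run from the submission directory
    # (the parallel environment and slot count are passed on the qsub
    #  line here: qsub -pe kmbmpi 4 my_mpirun_job.sh)

    # Record which mpirun the SGE environment actually picks up; if it
    # is not the Open MPI 1.4 that PETSc was linked against, that alone
    # could explain crashes that never appear in interactive runs.
    which mpirun
    mpirun --version

    # $NSLOTS is set by SGE to the number of slots granted by the PE;
    # an Open MPI built with SGE support reads the granted host list itself.
    mpirun -n $NSLOTS ./ex19 -dmmg_nlevels 4

Comparing the 'which mpirun' output from inside the job with the one used
for the interactive runs would go some way towards answering Satish's
question about which MPI the binaries are actually running under inside the
grid environment.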
