On Sun, May 6, 2012 at 9:24 AM, Alexander Grayver <agrayver at gfz-potsdam.de> wrote:

> Hm, valgrind gives a lot of output like that (see full log in previous
> message):
>

Can you run this with --download-f-blas-lapack? This sounds much more like
an MKL bug.

   Matt

> ==20287== Invalid read of size 8
> ==20287==    at 0x1AE79DA1: mkl_lapack_dlasq3 (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CF7AE5: mkl_lapack_dlasq3 (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AE79617: mkl_lapack_dlasq2 (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CF7A15: mkl_lapack_dlasq2 (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AA3E72A: mkl_lapack_dlasq1 (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CF79C7: mkl_lapack_dlasq1 (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AC44D6C: mkl_lapack_zbdsqr (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5CFFEF8: mkl_lapack_zbdsqr (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x1AC7D989: mkl_lapack_zgesvd (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_lapack.so)
> ==20287==    by 0x5D021C0: mkl_lapack_zgesvd (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_thread.so)
> ==20287==    by 0x5899E43: ZGESVD (in /opt/intel/Compiler/11.1/072/mkl/lib/em64t/libmkl_intel_lp64.so)
> ==20287==    by 0x697017: KSPComputeExtremeSingularValues_GMRES (gmreig.c:46)
> ==20287==    by 0x69EFBA: KSPComputeExtremeSingularValues (itfunc.c:47)
> ==20287==    by 0x4509BC: main (solveTest.c:62)
> ==20287==  Address 0x11363d48 is not stack'd, malloc'd or (recently) free'd
>
>
> On 06.05.2012 15:21, Alexander Grayver wrote:
>
> On 06.05.2012 15:07, Matthew Knepley wrote:
>
> Hello,
>>>
>>> I use KSP and random rhs to compute largest singular value:
>>>
>>
>> 1) Is this the whole program? If not, this can be caused by memory
>> corruption somewhere else. This is what I suspect.
>>
>>
>> Matt,
>>
>> I can reproduce error using attached test program and this matrix (7 MB):
>> http://dl.dropbox.com/u/60982984/A.dat
>>
>
> I run it fine with the latest petsc-dev:
>
> 1.405802e+00
>
> Can you valgrind it on your machine?
>
>
> I did:
> valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p
> /solveTest -ksp_monitor_true_residual -log_summary -mat_type aij -ksp_rtol
> 1.0e-10 -malloc off
>
> The error is better constrained:
>
> ==20287== Invalid read of size 8
> ==20287==    at 0x7874B4C: opal_os_path (in /opt/mpi/intel/openmpi-1.4.2/lib/libopen-pal.so.0.0.0)
> ==20287==    by 0x75F2E27: orte_session_dir_finalize (in /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x76012E8: orte_errmgr_base_error_abort (in /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x73396E9: ompi_mpi_abort (in /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x734F36E: PMPI_Abort (in /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x7499AB: PetscDefaultSignalHandler (signal.c:169)
> ==20287==    by 0x749267: PetscSignalHandler_Private (signal.c:53)
> ==20287==    by 0x924B9DF: ??? (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x535D9E: VecDestroyVecs (vector.c:653)
> ==20287==    by 0x68B61D: KSPReset_GMRES (gmres.c:258)
> ==20287==    by 0x6A9D39: KSPReset (itfunc.c:733)
> ==20287==    by 0x6AA839: KSPDestroy (itfunc.c:780)
> ==20287==    by 0x4509F8: main (solveTest.c:66)
> ==20287==  Address 0xbde4860 is 0 bytes inside a block of size 2 alloc'd
> ==20287==    at 0x4C26B9B: malloc (vg_replace_malloc.c:263)
> ==20287==    by 0x92876DF: vasprintf (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x9266C67: asprintf (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x75F1701: orte_util_convert_vpid_to_string (in /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x75F2D4A: orte_session_dir_finalize (in /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x76012E8: orte_errmgr_base_error_abort (in /opt/mpi/intel/openmpi-1.4.2/lib/libopen-rte.so.0.0.0)
> ==20287==    by 0x73396E9: ompi_mpi_abort (in /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x734F36E: PMPI_Abort (in /opt/mpi/intel/openmpi-1.4.2/lib/libmpi.so.0.0.2)
> ==20287==    by 0x7499AB: PetscDefaultSignalHandler (signal.c:169)
> ==20287==    by 0x749267: PetscSignalHandler_Private (signal.c:53)
> ==20287==    by 0x924B9DF: ??? (in /lib64/libc-2.11.1.so)
> ==20287==    by 0x535D9E: VecDestroyVecs (vector.c:653)
> ==20287==    by 0x68B61D: KSPReset_GMRES (gmres.c:258)
> ==20287==    by 0x6A9D39: KSPReset (itfunc.c:733)
> ==20287==    by 0x6AA839: KSPDestroy (itfunc.c:780)
> ==20287==    by 0x4509F8: main (solveTest.c:66)
>
> Full log is attached.
>
> Important.
> If I comment this line:
> KSPComputeExtremeSingularValues(ksp, &maxx, &minx);
>
> It works.
>
> --
>
> Regards,
> Alexander
>
>
>
> --
> Regards,
> Alexander
>
>

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener