It would be nice if we could automatically detect this issue more often. Barry
Begin forwarded message: > From: Mathis Friesdorf <[email protected]> > Subject: Re: [petsc-users] Unexpected "Out of memory error" with SLEPC > Date: June 26, 2014 at 5:08:23 AM CDT > To: Karl Rupp <[email protected]> > Cc: <[email protected]>, <[email protected]> > > Dear Karl, > > thanks a lot! This indeed worked. I recompiled PETSC with > > ./configure --with-64-bit-indices=1 > PETSC_ARCH = arch-linux2-c-debug-int64 > > and the error is gone. Again thanks for your help! > > All the best, Mathis > > > On Wed, Jun 25, 2014 at 12:42 PM, Karl Rupp <[email protected]> wrote: > Hi Mathis, > > this looks very much like an integer overflow: > http://www.mcs.anl.gov/petsc/documentation/faq.html#with-64-bit-indices > > Best regards, > Karli > > > On 06/25/2014 12:31 PM, Mathis Friesdorf wrote: > Dear all, > > after a very useful email exchange with Jed Brown quite a while ago, I > was able to find the lowest eigenvalue of a large matrix which is > constructed as a tensor product. Admittedly the solution is a bit > hacked, but is based on a Matrix shell and Armadillo and therefore > reasonably fast. The problem seems to work well for smaller systems, but > once the vectors reach a certain size, I get "out of memory" errors. I > have tested the initialization of a vector of that size and > multiplication by the matrix. This works fine and takes roughly 20GB of > memory. There are 256 GB available, so I see no reason why the esp > solvers should complain. Does anyone have an idea what goes wrong here? > The error message is not very helpful and claims that a memory is > requested that is way beyond any reasonable number: /Memory requested > 18446744056529684480./ > > > Thanks and all the best, Mathis Friesdorf > > > *Output of the Program:* > /mathis@n180:~/localisation$ ./local_plus 27/ > /System Size: 27 > > -------------------------------------------------------------------------- > [[30558,1],0]: A high-performance Open MPI point-to-point messaging module > was unable to find any relevant network interfaces: > > Module: OpenFabrics (openib) > Host: n180 > > Another transport will be used instead, although this may result in > lower performance. > -------------------------------------------------------------------------- > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 3221286704 Memory used by process > 3229827072 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 18446744056529684480! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.4, Mar, 13, 2014 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./local_plus on a arch-linux2-cxx-debug named n180 by > mathis Wed Jun 25 12:23:01 2014 > [0]PETSC ERROR: Libraries linked from /home/mathis/bin_nodebug/lib > [0]PETSC ERROR: Configure run at Wed Jun 25 00:03:34 2014 > [0]PETSC ERROR: Configure options PETSC_DIR=/home/mathis/petsc-3.4.4 > --with-debugging=1 COPTFLAGS="-O3 -march=p4 -mtune=p4" --with-fortran=0 > -with-mpi=1 --with-mpi-dir=/usr/lib/openmpi --with-clanguage=cxx > --prefix=/home/mathis/bin_nodebug > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 46 in > /home/mathis/petsc-3.4.4/src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 189 in > /home/mathis/petsc-3.4.4/src/sys/memory/mtr.c > [0]PETSC ERROR: VecDuplicateVecs_Contiguous() line 62 in > src/vec/contiguous.c > [0]PETSC ERROR: VecDuplicateVecs() line 589 in > /home/mathis/petsc-3.4.4/src/vec/vec/interface/vector.c > [0]PETSC ERROR: EPSAllocateSolution() line 51 in src/eps/interface/mem.c > [0]PETSC ERROR: EPSSetUp_KrylovSchur() line 141 in > src/eps/impls/krylov/krylovschur/krylovschur.c > [0]PETSC ERROR: EPSSetUp() line 147 in src/eps/interface/setup.c > [0]PETSC ERROR: EPSSolve() line 90 in src/eps/interface/solve.c > [0]PETSC ERROR: main() line 48 in local_plus.cpp > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 55. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > > / > *Output of make: > */mathis@n180:~/localisation$ make local_plus > > mpicxx -o local_plus.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -g -fPIC > -I/home/mathis/armadillo-4.300.8/include -lblas -llapack > -L/home/mathis/armadillo-4.300.8 -O3 -larmadillo -fomit-frame-pointer > -I/home/mathis/bin_nodebug/include -I/home/mathis/bin_nodebug/include > -I/usr/lib/openmpi/include -I/usr/lib/openmpi/include/openmpi > -D__INSDIR__= -I/home/mathis/bin_nodebug > -I/home/mathis/bin_nodebug//include -I/home/mathis/bin_nodebug/include > local_plus.cpp > local_plus.cpp:22:0: warning: "__FUNCT__" redefined [enabled by default] > In file included from /home/mathis/bin_nodebug/include/petscvec.h:10:0, > from local_plus.cpp:10: > /home/mathis/bin_nodebug/include/petscviewer.h:386:0: note: this is the > location of the previous definition > g++ -o local_plus local_plus.o -Wl,-rpath,/home/mathis/bin_nodebug//lib > -L/home/mathis/bin_nodebug//lib -lslepc > -Wl,-rpath,/home/mathis/bin_nodebug/lib -L/home/mathis/bin_nodebug/lib > -lpetsc -llapack -lblas -lpthread -Wl,-rpath,/usr/lib/openmpi/lib > -L/usr/lib/openmpi/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.7 > -L/usr/lib/gcc/x86_64-linux-gnu/4.7 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_cxx -lstdc++ -ldl -lmpi -lopen-rte > -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl > /bin/rm -f local_plus.o > / > *Code: > *///Author: Mathis Friesdorf > //[email protected] <mailto:[email protected]> > > > static char help[] = "1D chain\n"; > > #include <iostream> > #include <typeinfo> > #include <armadillo> > #include <petscsys.h> > #include <petscvec.h> > #include <petscmat.h> > #include <slepcsys.h> > #include <slepceps.h> > #include <math.h> > #include <assert.h> > > PetscErrorCode BathMult(Mat H, Vec x,Vec y); > PetscInt L=30,d=2,dsys; > PetscErrorCode ierr; > arma::mat hint = "1.0 0 0 0.0; 0 -1.0 2.0 0; 0 2.0 -1.0 0; 0 0 0 1.0;"; > > #define __FUNCT__ "main" > int main(int argc, char **argv) > { > Mat H; > EPS eps; > Vec xr,xi; > PetscScalar kr,ki; > PetscInt j, nconv; > > L = strtol(argv[1],NULL,10); > dsys = pow(d,L); > printf("%s","System Size: "); > printf("%i",L); > printf("%s","\n"); > SlepcInitialize(&argc,&argv,(char*)0,help); > > MatCreateShell(PETSC_COMM_WORLD,dsys,dsys,dsys,dsys,NULL,&H); > MatShellSetOperation(H,MATOP_MULT,(void(*)())BathMult); > ierr = MatGetVecs(H,NULL,&xr); CHKERRQ(ierr); > ierr = MatGetVecs(H,NULL,&xi); CHKERRQ(ierr); > > ierr = EPSCreate(PETSC_COMM_WORLD, &eps); CHKERRQ(ierr); > ierr = EPSSetOperators(eps, H, NULL); CHKERRQ(ierr); > ierr = EPSSetProblemType(eps, EPS_HEP); CHKERRQ(ierr); > ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); CHKERRQ(ierr); > ierr = EPSSetFromOptions( eps ); CHKERRQ(ierr); > ierr = EPSSolve(eps); CHKERRQ(ierr); > ierr = EPSGetConverged(eps, &nconv); CHKERRQ(ierr); > for (j=0; j<1; j++) { > EPSGetEigenpair(eps, j, &kr, &ki, xr, xi); > printf("%s","Lowest Eigenvalue: "); > PetscPrintf(PETSC_COMM_WORLD,"%9F",kr); > PetscPrintf(PETSC_COMM_WORLD,"\n"); > } > EPSDestroy(&eps); > > ierr = SlepcFinalize(); > return 0; > } > #undef __FUNCT__ > > #define __FUNCT__ "BathMult" > PetscErrorCode BathMult(Mat H, Vec x, Vec y) > { > PetscInt l; > uint slice; > PetscScalar *arrayin,*arrayout; > > VecGetArray(x,&arrayin); > VecGetArray(y,&arrayout); > arma::cube A = arma::cube(arrayin,1,1,pow(d,L), > /*copy_aux_mem*/false,/*strict*/true); > arma::mat result = arma::mat(arrayout,pow(d,L),1, > /*copy_aux_mem*/false,/*strict*/true); > for (l=0;l<L-1;l++){ > A.reshape(pow(d,L-2-l),pow(d,2),pow(d,l)); > result.reshape(pow(d,L-l),pow(d,l)); > for (slice=0;slice<A.n_slices;slice++){ > result.col(slice) += vectorise(A.slice(slice)*hint); > } > } > arrayin = A.memptr(); > ierr = VecRestoreArray(x,&arrayin); CHKERRQ(ierr); > arrayout = result.memptr(); > ierr = VecRestoreArray(y,&arrayout); CHKERRQ(ierr); > PetscFunctionReturn(0); > } > #undef __FUNCT__/* > * > >
