That is just a write statement in the code. You are trying to put in the 0 value but it is ignoring it as it is suppose to so all is good for the sequential guess.
> On Oct 25, 2019, at 8:56 AM, Thibaut Appel <t.appe...@imperial.ac.uk> wrote: > > Hi Barry, > > I was referring to: > > row 0: (0, 1.) > row 1: (1, 1.) > row 2: (2, 1.) > row 3: (3, 1.) > row 4: (4, 1.) > row 5: (5, 1.) > row 6: (6, 1.) > row 7: (7, 1.) > row 8: (8, 1.) > row 9: (9, 1.) > row: 0 col: 9 val: 0.000000000000000000E+00 0.000000000000000000E+00 > > > The last line. But I was probably mistaken - if it was inserted it would have > been > row 0: (0, 1.), (9, 0.) > > on the first line instead? > > Thibaut > > > > On 25/10/2019 14:41, Smith, Barry F. wrote: >> >> >> > On Oct 24, 2019, at 5:09 AM, Thibaut Appel <t.appe...@imperial.ac.uk> >> > wrote: >> > >> > Hi Matthew, >> > >> > Thanks for having a look, your example runs just like mine in Fortran. >> > >> > In serial, the value (0.0,0.0) was inserted whereas it shouldn't have. >> >> I'm sorry, I don't see this for the serial case: >> >> $ petscmpiexec -n 1 ./ex241f >> Mat Object: 1 MPI processes >> type: seqaij >> row 0: (0, 2.) >> row 1: (1, 2.) >> row 2: (2, 2.) >> row 3: (3, 2.) >> row 4: (4, 2.) >> row 5: (5, 2.) >> row 6: (6, 2.) >> row 7: (7, 2.) >> row 8: (8, 2.) >> row 9: (9, 2.) >> >> > >> Where is the "(0.0,0.0) was inserted whereas it shouldn't have."? >> >> >> Barry >> >> >> > In parallel, you'll see that an error "Inserting a new nonzero at global >> > row/column" is triggered. >> > >> > In both cases, MatSetValues tries to insert a zero value whereas >> > IGNORE_ZERO_ENTRIES was set. That's what Barry is looking into, if I'm not >> > mistaken. >> > >> > >> > >> > Thibaut >> > >> > >> > >> > On 24/10/2019 02:31, Matthew Knepley wrote: >> >> On Tue, Oct 22, 2019 at 1:37 PM Thibaut Appel <t.appe...@imperial.ac.uk> >> >> wrote: >> >> Hi both, >> >> >> >> Please find attached a tiny example (in Fortran, sorry Matthew) that - I >> >> think - reproduces the problem we mentioned. >> >> >> >> Let me know. >> >> >> >> Okay, I converted to C so I could understand, and it runs fine for me: >> >> >> >> master *:~/Downloads/tmp$ PETSC_ARCH=arch-master-complex-debug make main >> >> /PETSc3/petsc/bin/mpicc -o main.o -c -Wall -Wwrite-strings >> >> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector >> >> -Qunused-arguments -fvisibility=hidden -g3 >> >> -I/PETSc3/petsc/petsc-dev/include >> >> -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include >> >> -I/PETSc3/petsc/include -I/opt/X11/include `pwd`/main.c >> >> /PETSc3/petsc/bin/mpicc -Wl,-multiply_defined,suppress >> >> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs >> >> -Wl,-search_paths_first -Wl,-no_compact_unwind -Wall -Wwrite-strings >> >> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector >> >> -Qunused-arguments -fvisibility=hidden -g3 -o main main.o >> >> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib >> >> -L/opt/X11/lib >> >> -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 >> >> -L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 >> >> -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 >> >> -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco >> >> -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl >> >> -lmpichf90 -lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran -lgcc_s.10.5 >> >> -lstdc++ -ldl >> >> master *:~/Downloads/tmp$ ./main >> >> After first assembly: >> >> Mat Object: 1 MPI processes >> >> type: seqaij >> >> row 0: (0, 1. + 1. i) >> >> row 1: (1, 1. + 1. i) >> >> row 2: (2, 1. + 1. i) >> >> row 3: (3, 1. + 1. i) >> >> row 4: (4, 1. + 1. i) >> >> row 5: (5, 1. + 1. i) >> >> row 6: (6, 1. + 1. i) >> >> row 7: (7, 1. + 1. i) >> >> row 8: (8, 1. + 1. i) >> >> row 9: (9, 1. + 1. i) >> >> After second assembly: >> >> Mat Object: 1 MPI processes >> >> type: seqaij >> >> row 0: (0, 1. + 1. i) >> >> row 1: (1, 1. + 1. i) >> >> row 2: (2, 1. + 1. i) >> >> row 3: (3, 1. + 1. i) >> >> row 4: (4, 1. + 1. i) >> >> row 5: (5, 1. + 1. i) >> >> row 6: (6, 1. + 1. i) >> >> row 7: (7, 1. + 1. i) >> >> row 8: (8, 1. + 1. i) >> >> row 9: (9, 1. + 1. i) >> >> row 0 col: 9 val: 0. + 0. i >> >> >> >> I attach my code. So then I ran your Fortran: >> >> >> >> /PETSc3/petsc/bin/mpif90 -c -Wall -ffree-line-length-0 >> >> -Wno-unused-dummy-argument -Wno-unused-variable -g >> >> -I/PETSc3/petsc/petsc-dev/include >> >> -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include >> >> -I/PETSc3/petsc/include -I/opt/X11/include -o main2.o main2.F90 >> >> /PETSc3/petsc/bin/mpif90 -Wl,-multiply_defined,suppress >> >> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs >> >> -Wl,-search_paths_first -Wl,-no_compact_unwind -Wall >> >> -ffree-line-length-0 -Wno-unused-dummy-argument -Wno-unused-variable -g >> >> -o main2 main2.o >> >> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib >> >> -Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib >> >> -L/opt/X11/lib >> >> -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 >> >> -L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 >> >> -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 >> >> -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco >> >> -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl >> >> -lmpichf90 -lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran -lgcc_s.10.5 >> >> -lstdc++ -ldl >> >> master *:~/Downloads/tmp$ ./main2 >> >> After first assembly: >> >> Mat Object: 1 MPI processes >> >> type: seqaij >> >> row 0: (0, 1.) >> >> row 1: (1, 1.) >> >> row 2: (2, 1.) >> >> row 3: (3, 1.) >> >> row 4: (4, 1.) >> >> row 5: (5, 1.) >> >> row 6: (6, 1.) >> >> row 7: (7, 1.) >> >> row 8: (8, 1.) >> >> row 9: (9, 1.) >> >> After second assembly: >> >> Mat Object: 1 MPI processes >> >> type: seqaij >> >> row 0: (0, 1.) >> >> row 1: (1, 1.) >> >> row 2: (2, 1.) >> >> row 3: (3, 1.) >> >> row 4: (4, 1.) >> >> row 5: (5, 1.) >> >> row 6: (6, 1.) >> >> row 7: (7, 1.) >> >> row 8: (8, 1.) >> >> row 9: (9, 1.) >> >> row: 0 col: 9 val: 0.000000000000000000E+00 0.000000000000000000E+00 >> >> >> >> I am not seeing an error. Am I not running it correctly? >> >> >> >> Thanks, >> >> >> >> MAtt >> >> Thibaut >> >> >> >> >> >> >> >> On 22/10/2019 17:48, Matthew Knepley wrote: >> >>> On Tue, Oct 22, 2019 at 12:43 PM Thibaut Appel via petsc-users >> >>> <petsc-users@mcs.anl.gov> wrote: >> >>> Hi Hong, >> >>> >> >>> Thank you for having a look, I copied/pasted your code snippet into >> >>> ex28.c and the error indeed appears if you change that col[0]. That's >> >>> because you did not allow a new non-zero location in the matrix with the >> >>> option MAT_NEW_NONZERO_LOCATION_ERR. >> >>> >> >>> I spent the day debugging the code and already checked my calls to >> >>> MatSetValues: >> >>> >> >>> For all MatSetValues calls corresponding to the row/col location in the >> >>> error messages in the subsequent assembly, the numerical value >> >>> associated with that row/col was exactly (0.0,0.0) (complex arithmetic) >> >>> so it shouldn't be inserted w.r.t. the option MAT_IGNORE_ZERO_ENTRIES. >> >>> It seems MatSetValues still did it anyway. >> >>> >> >>> Okay, lets solve this problem first. You say that >> >>> >> >>> - You called MatSetOption(A, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE) >> >>> - You called MatSetValues(A, ,,,, ADD_VALUES, ..., val) and val had a >> >>> complex 0 in it >> >>> - PETSc tried to insert the complex 0 >> >>> >> >>> This should be really easy to test in a tiny example. Do you mind making >> >>> it? If its broken, I will fix it. >> >>> >> >>> Thanks, >> >>> >> >>> Matt >> >>> I was able to solve the problem by adding >> >>> MatSetOption(L,MAT_NEW_NONZERO_LOCATIONS,PETSC_FALSE) after my first >> >>> assembly. However I don't know why it fixed it as the manual seems to >> >>> say this is just for efficiency purposes >> >>> >> >>> It "should be specified after the first matrix has been fully assembled. >> >>> This option ensures that certain data structures and communication >> >>> information will be reused (instead of regenerated during successive >> >>> steps, thereby increasing efficiency" >> >>> >> >>> So I'm still puzzled by why I got that error in the first place. Unless >> >>> "regenerated" implies resetting some attributes of the preallocated >> >>> non-zero structure / first assembly? >> >>> >> >>> >> >>> >> >>> Thibaut >> >>> >> >>> >> >>> >> >>> On 22/10/2019 17:06, Zhang, Hong wrote: >> >>>> Thibaut: >> >>>> Check your code on MatSetValues(), likely you set a value "to a new >> >>>> nonzero at global row/column (200, 160) into matrix" L. >> >>>> I tested petsc/src/mat/examples/tests/ex28.c by adding >> >>>> @@ -95,6 +95,26 @@ int main(int argc,char **args) >> >>>> /* Compute numeric factors using same F, then solve */ >> >>>> for (k = 0; k < num_numfac; k++) { >> >>>> /* Get numeric factor of A[k] */ >> >>>> + if (k==0) { >> >>>> + ierr = MatZeroEntries(A[0]);CHKERRQ(ierr); >> >>>> + for (i=rstart; i<rend; i++) { >> >>>> + col[0] = i-1; col[1] = i; col[2] = i+1; >> >>>> + if (i == 0) { >> >>>> + ierr = >> >>>> MatSetValues(A[k],1,&i,2,col+1,value+1,INSERT_VALUES);CHKERRQ(ierr); >> >>>> + } else if (i == N-1) { >> >>>> + ierr = >> >>>> MatSetValues(A[k],1,&i,2,col,value,INSERT_VALUES);CHKERRQ(ierr); >> >>>> + } else { >> >>>> + ierr = >> >>>> MatSetValues(A[k],1,&i,3,col,value,INSERT_VALUES);CHKERRQ(ierr); >> >>>> + } >> >>>> + } >> >>>> + if (!rank) { >> >>>> + i = N - 1; col[0] = N - 1; >> >>>> + ierr = >> >>>> MatSetValues(A[k],1,&i,1,col,value,INSERT_VALUES);CHKERRQ(ierr); >> >>>> + } >> >>>> + ierr = MatAssemblyBegin(A[k],MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >>>> + ierr = MatAssemblyEnd(A[k],MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >>>> + } >> >>>> + >> >>>> >> >>>> It works in both sequential and parallel. If I set col[0] = 0, then I >> >>>> get the same error as yours. >> >>>> Hong >> >>>> Dear PETSc developers, >> >>>> >> >>>> I'm extending a validated matrix preallocation/assembly part of my code >> >>>> to solve multiple linear systems with MUMPS at each iteration of a main >> >>>> loop, following the example src/mat/examples/tests/ex28.c that Hong >> >>>> Zhang added a few weeks ago. The difference is that I'm using just 1 >> >>>> matrix to solve different systems. >> >>>> >> >>>> I'm trying to investigate a nasty bug arising when I try to assemble >> >>>> "for a second time" that MPIAIJ matrix. The issue arises only in >> >>>> parallel, serial works fine. >> >>>> >> >>>> Creating 1 MPIAIJ matrix, preallocating it "perfectly" with the case >> >>>> where I have the fewest zero entries in the non-zero structure, before >> >>>> getting its symbolic factorization. >> >>>> >> >>>> Further in the main loop, I'm solely changing its entries retaining the >> >>>> non-zero structure. >> >>>> >> >>>> Here is the simplified Fortran code I'm using: >> >>>> >> >>>> ! Fill (M,N) case to ensure all non-zero entries are preallocated >> >>>> CALL set_equations(M,N) >> >>>> >> >>>> CALL alloc_matrix(L) >> >>>> ! --> Call MatSeqAIJSetPreallocation/MatMPIAIJSetPreallocation >> >>>> ! --> Sets MAT_IGNORE_ZERO_ENTRIES, MAT_NEW_NONZERO_ALLOCATION_ERR, >> >>>> MAT_NO_OFF_PROC_ENTRIES to true >> >>>> >> >>>> CALL assemble_matrix(L) >> >>>> ! --> Calls MatSetValues with ADD_VALUES >> >>>> ! --> Call MatAssemblyBegin/MatAssemblyEnd >> >>>> >> >>>> ! Tell PETSc that new non-zero insertions in matrix are forbidden >> >>>> CALL MatSetOption(L,MAT_NEW_NONZERO_LOCATION_ERR,PETSC_TRUE,ierr) >> >>>> >> >>>> CALL set_mumps_parameters() >> >>>> >> >>>> ! Get symbolic LU factorization using MUMPS >> >>>> CALL MatGetFactor(L,MATSOLVERMUMPS,MAT_FACTOR_LU,F,ierr) >> >>>> CALL MatGetOrdering(L,MATORDERINGNATURAL,rperm,cperm,ierr) >> >>>> CALL MatLUFactorSymbolic(F,L,rperm,cperm,info,ierr) >> >>>> >> >>>> CALL initialize_right_hand_sides() >> >>>> >> >>>> ! Zero matrix entries >> >>>> CALL MatZeroEntries(L,ierr) >> >>>> >> >>>> ! Main loop >> >>>> DO itr=1, maxitr >> >>>> >> >>>> DO m = 1, M >> >>>> DO n = 1, N >> >>>> >> >>>> CALL set_equations(m,n) >> >>>> CALL assemble_matrix(L) ! ERROR HERE when m=1, n=1, CRASH IN >> >>>> MatSetValues call >> >>>> >> >>>> ! Solving the linear system associated with (m,n) >> >>>> CALL MatLUFactorNumeric(F,L,info,ierr) >> >>>> CALL MatSolve(F,v_rhs(m,n),v_sol(m,n),ierr) >> >>>> >> >>>> ! Process v_rhs's from v_sol's for next iteration >> >>>> >> >>>> CALL MatZeroEntries(L,ierr) >> >>>> >> >>>> END DO >> >>>> END DO >> >>>> >> >>>> END DO >> >>>> >> >>>> >> >>>> >> >>>> Testing on a small case, the error I get is >> >>>> >> >>>> [1]PETSC ERROR: --------------------- Error Message >> >>>> -------------------------------------------------------------- >> >>>> [1]PETSC ERROR: Argument out of range >> >>>> [1]PETSC ERROR: Inserting a new nonzero at global row/column (200, 160) >> >>>> into matrix >> >>>> [1]PETSC ERROR: See >> >>>> https://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >> >>>> shooting. >> >>>> [1]PETSC ERROR: Petsc Release Version 3.12.0, unknown >> >>>> [1]PETSC ERROR: Configure options --PETSC_ARCH=cplx_gcc_debug >> >>>> --with-scalar-type=complex --with-precision=double --with-debugging=1 >> >>>> --with-valgrind=1 --with-debugger=gdb --with-fortran-kernels=1 >> >>>> --download-mpich --download-fblaslapack --download-scalapack >> >>>> --download-metis --download-parmetis --download-ptscotch >> >>>> --download-mumps --download-slepc --COPTFLAGS="-O0 -g" >> >>>> --CXXOPTFLAGS="-O0 -g" --FOPTFLAGS="-O0 -g -fbacktrace" >> >>>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 634 in >> >>>> /home/thibaut/Packages/petsc/src/mat/impls/aij/mpi/mpiaij.c >> >>>> [1]PETSC ERROR: #2 MatSetValues() line 1375 in >> >>>> /home/thibaut/Packages/petsc/src/mat/interface/matrix.c >> >>>> [1]PETSC ERROR: #3 User provided function() line 0 in User file >> >>>> application called MPI_Abort(MPI_COMM_SELF, 63) - process 0 >> >>>> >> >>>> >> >>>> which I don't understand. That element was not in the non-zero >> >>>> structure and wasn't preallocated. I printed the value to be inserted >> >>>> at this location (200,160) and it is exactly >> >>>> (0.0000000000000000,0.0000000000000000) so this entry should not be >> >>>> inserted due to MAT_IGNORE_ZERO_ENTRIES, however it seems it is. I'm >> >>>> using ADD_VALUES in MatSetValues but it is the only call where >> >>>> (200,160) is inserted. >> >>>> >> >>>> >> >>>> >> >>>> - I zero the matrix entries with MatZeroEntries which retains the >> >>>> non-zero structure (checked when I print the matrix) but tried to >> >>>> comment the corresponding calls. >> >>>> >> >>>> - I tried to set MAT_NEW_NONZERO_LOCATION_ERR AND >> >>>> MAT_NEW_NONZERO_ALLOCATION_ERR to PETSC_FALSE without effect. >> >>>> >> >>>> >> >>>> >> >>>> Perhaps there's something fundamentally wrong in my approach, in any >> >>>> case would you have any suggestions to identify the exact problem? >> >>>> >> >>>> Using PETSc 3.12.0. Thanks for your support, >> >>>> >> >>>> >> >>>> >> >>>> Thibaut >> >>>> >> >>> >> >>> >> >>> -- >> >>> What most experimenters take for granted before they begin their >> >>> experiments is infinitely more interesting than any results to which >> >>> their experiments lead. >> >>> -- Norbert Wiener >> >>> >> >>> https://www.cse.buffalo.edu/~knepley/ >> >> >> >> >> >> -- >> >> What most experimenters take for granted before they begin their >> >> experiments is infinitely more interesting than any results to which >> >> their experiments lead. >> >> -- Norbert Wiener >> >> >> >> https://www.cse.buffalo.edu/~knepley/ >>