Re: [petsc-users] BAIJCUSPARSE?
Xiangdong,

cuSPARSE does support the block compressed sparse row (BAIJ) format, but we don't currently support that cuSPARSE functionality in PETSc. It should be easy to add, but we are currently refactoring the way we interface with third-party GPU libraries such as cuSPARSE, and it would probably make more sense to add this support after that refactor is done. Do you need this right away, or could it wait maybe a few weeks until the refactor is completed?

Best regards,
Richard

On Fri, Oct 25, 2019 at 1:50 PM Smith, Barry F. via petsc-users <petsc-users@mcs.anl.gov> wrote:

You would need to investigate whether the Nvidia cuSPARSE package supports such a format. If it does, it would be reasonably straightforward for you to hook up the required interface from PETSc. If it does not, it is a massive job to provide such code; you should see whether any open-source packages provide such CUDA support, and then you could hook PETSc up to use that.

Barry

> On Oct 25, 2019, at 3:43 PM, Xiangdong via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> Can anyone comment on PETSc's GPU version of block CSR, say BAIJCUSPARSE? Does it make sense to have such a format on GPU? Is it under development?
>
> Thank you.
>
> Xiangdong
>
> On Wed, Oct 23, 2019 at 11:36 AM Xiangdong <epsco...@gmail.com> wrote:
> Hello everyone,
>
> I am wondering whether there is a format BAIJCUSPARSE for block CSR on GPU.
>
> In my current code, the Jacobian matrix is preallocated and assembled in BAIJ format. Do I have to rewrite this part of the code to preallocate and assemble the matrix as AIJ in order to use aijcusparse?
>
> Thank you.
>
> Xiangdong
Re: [petsc-users] 'Inserting a new nonzero' issue on a reassembled matrix in parallel
Yes, OK, so it seems the problem arises in parallel only, as mentioned in my first email. That output confused me.

Thibaut

On 25/10/2019 15:19, Smith, Barry F. wrote:

That is just a write statement in the code. You are trying to put in the 0 value but it is ignoring it as it is supposed to, so all is good for the sequential case.

> On Oct 25, 2019, at 8:56 AM, Thibaut Appel wrote:
>
> Hi Barry,
>
> I was referring to:
>
> row 0: (0, 1.)
> row 1: (1, 1.)
> row 2: (2, 1.)
> row 3: (3, 1.)
> row 4: (4, 1.)
> row 5: (5, 1.)
> row 6: (6, 1.)
> row 7: (7, 1.)
> row 8: (8, 1.)
> row 9: (9, 1.)
> row: 0 col: 9 val: 0.00E+00 0.00E+00
>
> The last line. But I was probably mistaken - if it had been inserted it would have appeared as
> row 0: (0, 1.), (9, 0.)
> on the first line instead?
>
> Thibaut
>
> On 25/10/2019 14:41, Smith, Barry F. wrote:
>> > On Oct 24, 2019, at 5:09 AM, Thibaut Appel wrote:
>> >
>> > Hi Matthew,
>> >
>> > Thanks for having a look; your example runs just like mine in Fortran.
>> >
>> > In serial, the value (0.0,0.0) was inserted whereas it shouldn't have been.
>>
>> I'm sorry, I don't see this for the serial case:
>>
>> $ petscmpiexec -n 1 ./ex241f
>> Mat Object: 1 MPI processes
>> type: seqaij
>> row 0: (0, 2.)
>> row 1: (1, 2.)
>> row 2: (2, 2.)
>> row 3: (3, 2.)
>> row 4: (4, 2.)
>> row 5: (5, 2.)
>> row 6: (6, 2.)
>> row 7: (7, 2.)
>> row 8: (8, 2.)
>> row 9: (9, 2.)
>>
>> Where is the "(0.0,0.0) was inserted whereas it shouldn't have"?
>>
>> Barry
>>
>> > In parallel, you'll see that an error "Inserting a new nonzero at global row/column" is triggered.
>> >
>> > In both cases, MatSetValues tries to insert a zero value whereas IGNORE_ZERO_ENTRIES was set. That's what Barry is looking into, if I'm not mistaken.
>> >
>> > Thibaut
>> >
>> > On 24/10/2019 02:31, Matthew Knepley wrote:
>> >> On Tue, Oct 22, 2019 at 1:37 PM Thibaut Appel wrote:
>> >> Hi both,
>> >>
>> >> Please find attached a tiny example (in Fortran, sorry Matthew) that - I think - reproduces the problem we mentioned.
>> >>
>> >> Let me know.
Okay, I converted to C so I could understand, and it runs fine for me:

master *:~/Downloads/tmp$ PETSC_ARCH=arch-master-complex-debug make main
/PETSc3/petsc/bin/mpicc -o main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -I/PETSc3/petsc/petsc-dev/include -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include -I/PETSc3/petsc/include -I/opt/X11/include `pwd`/main.c
/PETSc3/petsc/bin/mpicc -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -o main main.o -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib -L/opt/X11/lib -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 -L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl -lmpichf90 -lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran -lgcc_s.10.5 -lstdc++ -ldl
master *:~/Downloads/tmp$ ./main
After first assembly:
Mat Object: 1 MPI processes
type: seqaij
row 0: (0, 1. + 1. i)
row 1: (1, 1. + 1. i)
row 2: (2, 1. + 1. i)
row 3: (3, 1. + 1. i)
row 4: (4, 1. + 1. i)
row 5: (5, 1. + 1. i)
row 6: (6, 1. + 1. i)
row 7: (7, 1. + 1. i)
row 8: (8, 1. + 1. i)
row 9: (9, 1. + 1. i)
After second assembly:
Mat Object: 1 MPI processes
type: seqaij
row 0: (0, 1. + 1. i)
row 1: (1, 1. + 1. i)
row 2: (2, 1. + 1. i)
row 3: (3, 1. + 1. i)
row 4: (4, 1. + 1. i)
row 5: (5, 1. + 1. i)
row 6: (6, 1. + 1. i)
row 7: (7, 1. + 1. i)
row 8: (8, 1. + 1. i)
row 9: (9, 1. + 1. i)
row 0 col: 9 val: 0. + 0. i

I attach my code. So then I ran your Fortran:

/PETSc3/petsc/bin/mpif90 -c -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -Wno-unused-variable -g -I/PETSc3/petsc/petsc-dev/include -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include -I/PETSc3/petsc/include -I/opt/X11/include -o main2.o main2.F90
/PETSc3/petsc/bin/mpif90 -Wl,-multiply_defined,suppress -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first -Wl,-no_compact_unwind -Wall -ffree-line-length-0 -Wno-unused-dummy-argument -Wno-unused-variable -g -o main2 main2.o -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib -Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib
Re: [petsc-users] Block preconditioning for 3d problem
> On Oct 25, 2019, at 1:12 AM, Dave Lee wrote:
>
> Thanks for your thoughts Barry, I will take this on board. It's not entirely clear to me how to back out the dx updates for the other variables (velocity, density, temperature) from the SNES at each Newton iteration (for the pressure solve), though I guess I could just do these myself within the SNESComputeFunction() user routine.

I don't understand your algorithm or approach. Are you just doing a full nonlinear solve on (velocity, density, temperature, pressure), but inside the linear solve you want/need to handle "pressure" specifically as part of the preconditioner?

> There is no doubt also a mechanism for doing this via FieldSplit that I'm not familiar with.

If this is the case, then FieldSplit was designed for exactly this purpose. There are lots of examples showing how to use FieldSplit in a variety of ways.

> I've found a partial solution to my problem, doing a Gauss-Seidel two-stage solve for the pressure at each Newton iteration, where I do an intermediate solve based on implicit horizontal gradients, then a final solve (for the current Newton iteration) based on implicit vertical gradients.
>
> This converges well, but I think there are still time-splitting issues with this uncoupled approach, as the full solution goes unstable after a few time steps (dt >> CFL).
>
> I will take your advice and look at reformulating my outer problem for a SNES (line search) solve.
>
> Cheers, Dave.
>
> On Fri, 25 Oct. 2019, 2:52 am Smith, Barry F., wrote:
>
> If you are "throwing things away" in computing the Jacobian, then any expectation that Newton will converge is also thrown out the window.
>
> If you used the PETSc SNES nonlinear solver you would have much more room to experiment in. For example, you could use -snes_mf_operator to apply the "true" matrix-vector product (computed with differencing) while building the preconditioner by "throwing things away", and still expect convergence of the nonlinear solver. It also has tools like -snes_jacobian_test that compare your Jacobian to one computed by PETSc to see if there are bugs in your Jacobian code, and it can compute the full Jacobian with differencing and coloring pretty efficiently. It also offers a large set of line-search algorithms.
>
> Newton is not simple (though it looks it), and you should bring powerful Newton tools (SNES, not home-brew Newton solvers) to the game, especially when tackling problems with potentially interesting nonlinearities.
>
> Barry
>
> > On Oct 14, 2019, at 8:18 PM, Dave Lee wrote:
> >
> > Hi Barry,
> >
> > I've replied inline:
> >
> > On Mon, Oct 14, 2019 at 4:07 PM Smith, Barry F. wrote:
> >
> > Thanks. It seems that the iterative method solved the entire linear system very accurately. For the last linear solve, the initial true residual norm is around 10^6 (i.e. the norm of b) and the final one around 10^-10.
> >
> > The initial true residual norm on each block (part of a slab on one process) is around 10^-1, so I don't see a huge difference in scaling between the blocks. The final true residual for the blocks is around 10^-17, so each block is well solved.
> >
> > Thank you for taking the time to look at this Barry. Glad to know there's nothing too crazy going on here.
> >
> > What do you mean by "quasi-Newton"? Are you doing any kind of line search or just using the full step from the linear solve?
> >
> > I'm using the full step from the linear solve. By "quasi-Newton" I just mean that the Jacobian is only approximate. I've left out and simplified some terms in order to make the Schur complement more tractable.
> >
> > When you solve with a single KSP over all slabs, the convergence is determined by ||r_all||/||b_all|| < tol, while slab by slab it is ||r_slab||/||b_slab|| < tol for each slab. If the slabs had hugely different scalings this could result in different "answers", but since the slabs seem to have very similar scalings the "answers" should be very similar.
> >
> > I would do the following. Solve the first linear system both ways, then compute the difference between the solutions computed both ways. Relative to ||b||, the norm of the difference of the two solutions should be around 10^-15, indicating both approaches produced almost identical answers. If this is the case, then the problem is elsewhere, not in the solution of the linear system. On the other hand, if the difference is large, then likely the problem is in the formulation of the linear system; perhaps the combined linear system is not actually the individual linear systems.
> >
> > Thanks for the tip Barry. I tried this, and the normalised difference between the full solve and the slab-wise solve is approximately 1.0e-4 to 1.0e-3