Re: [petsc-users] BAIJCUSPARSE?

2019-10-25 Thread Mills, Richard Tran via petsc-users
Xiangdong,

cuSPARSE does support block compressed sparse row (BAIJ) format, but we don't 
currently support that cuSPARSE functionality in PETSc. It should be easy to 
add, but we are currently refactoring the way we interface with third party GPU 
libraries such as cuSPARSE, and it would probably make more sense to add this 
support after that refactor is done. Do you need this right away, or could it 
wait maybe a few weeks until this is completed?
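In the meantime, the usual way to run on the GPU is the (unblocked) AIJCUSPARSE
format, which normally only requires changing the matrix type and preallocation
calls, not the MatSetValues() assembly itself. A minimal sketch, assuming a
CUDA-enabled PETSc build; the sizes and preallocation counts below are
placeholders:

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  PetscInt       n = 100;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MatCreate(PETSC_COMM_WORLD, &A); CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n); CHKERRQ(ierr);
  /* AIJCUSPARSE stores the matrix in CSR on the GPU; alternatively call
     MatSetFromOptions() and pass -mat_type aijcusparse at run time */
  ierr = MatSetType(A, MATAIJCUSPARSE); CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(A, 5, NULL); CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(A, 5, NULL, 5, NULL); CHKERRQ(ierr);
  /* ... the existing MatSetValues() assembly loop goes here unchanged ... */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatDestroy(&A); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}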

Best regards,
Richard

On Fri, Oct 25, 2019 at 1:50 PM Smith, Barry F. via petsc-users 
<petsc-users@mcs.anl.gov> wrote:

  You would need to investigate if the Nvidia cuSPARSE package supports such a 
format. If it does then it would be reasonably straightforward for you to hook 
up the required interface from PETSc. If it does not then it is a massive job 
to provide such code and you should see if any open source packages provide 
such CUDA support and then you could hook PETSc up to use that.

  Barry


> On Oct 25, 2019, at 3:43 PM, Xiangdong via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
>
> Can anyone comment on the PETSc's GPU version of Block CSR, say BAIJCUSPARSE? 
> Does it make sense to have such format on GPU? Is it under development?
>
> Thank you.
>
> Xiangdong
>
> On Wed, Oct 23, 2019 at 11:36 AM Xiangdong 
> <epsco...@gmail.com> wrote:
> Hello everyone,
>
> I am wondering whether there is a format BAIJCUSPARSE for Block CSR on GPU.
>
> In my current code, the Jacobian matrix is preallocated and assembled in BAIJ 
> format. Do I have to rewrite this part of the code to preallocate and assemble 
> the matrix as AIJ in order to use aijcusparse?
>
> Thank you.
>
> Xiangdong
>



Re: [petsc-users] BAIJCUSPARSE?

2019-10-25 Thread Xiangdong via petsc-users
Can anyone comment on the PETSc's GPU version of Block CSR, say
BAIJCUSPARSE? Does it make sense to have such format on GPU? Is it under
development?

Thank you.

Xiangdong

On Wed, Oct 23, 2019 at 11:36 AM Xiangdong  wrote:

> Hello everyone,
>
> I am wondering whether there is a format BAIJCUSPARSE for Block CSR on GPU.
>
> In my current code, the Jacobian matrix is preallocated and assembled in BAIJ
> format. Do I have to rewrite this part of the code to preallocate and assemble
> the matrix as AIJ in order to use aijcusparse?
>
> Thank you.
>
> Xiangdong
>
>


Re: [petsc-users] BAIJCUSPARSE?

2019-10-25 Thread Smith, Barry F. via petsc-users


  You would need to investigate if the Nvidia cuSPARSE package supports such a 
format. If it does then it would be reasonably straightforward for you to hook 
up the required interface from PETSc. If it does not then it is a massive job 
to provide such code and you should see if any open source packages provide 
such CUDA support and then you could hook PETSc up to use that.
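For reference, the cuSPARSE routine such an interface would have to wrap for a
block-CSR matrix-vector product is the bsrmv family. A sketch only, using the
cuSPARSE C API of that era; the handle, matrix descriptor and device arrays are
assumed to have been created and populated elsewhere:

#include <cusparse.h>

/* y = alpha*A*x + beta*y with A stored in block CSR (BSR); mb/nb are the
   number of block rows/columns, nnzb the number of nonzero blocks */
void bsr_matvec(cusparseHandle_t handle, cusparseMatDescr_t descr,
                int mb, int nb, int nnzb, int blockDim,
                const double *bsrVal, const int *bsrRowPtr,
                const int *bsrColInd, const double *x, double *y)
{
  const double alpha = 1.0, beta = 0.0;
  /* cusparseDbsrmv() is the BSR analogue of the CSR csrmv routine */
  cusparseDbsrmv(handle, CUSPARSE_DIRECTION_ROW,
                 CUSPARSE_OPERATION_NON_TRANSPOSE,
                 mb, nb, nnzb, &alpha, descr,
                 bsrVal, bsrRowPtr, bsrColInd, blockDim,
                 x, &beta, y);
}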

  Barry


> On Oct 25, 2019, at 3:43 PM, Xiangdong via petsc-users 
>  wrote:
> 
> Can anyone comment on the PETSc's GPU version of Block CSR, say BAIJCUSPARSE? 
> Does it make sense to have such format on GPU? Is it under development?
> 
> Thank you.
> 
> Xiangdong
> 
> On Wed, Oct 23, 2019 at 11:36 AM Xiangdong  wrote:
> Hello everyone,
> 
> I am wondering whether there is a format BAIJCUSPARSE for Block CSR on GPU.
> 
> In my current code, the Jacobian matrix is preallocated and assembled in BAIJ 
> format. Do I have to rewrite this part of the code to preallocate and assemble 
> the matrix as AIJ in order to use aijcusparse?
> 
> Thank you.
> 
> Xiangdong
> 



Re: [petsc-users] 'Inserting a new nonzero' issue on a reassembled matrix in parallel

2019-10-25 Thread Thibaut Appel via petsc-users
Yes, OK, so it seems the problem arose in parallel only, as mentioned in 
my first email. That output confused me.


Thibaut

On 25/10/2019 15:19, Smith, Barry F. wrote:

   That is just a write statement in the code. You are trying to put in the 0 
value but it is ignoring it, as it is supposed to, so all is good for the 
sequential case.




On Oct 25, 2019, at 8:56 AM, Thibaut Appel  wrote:

Hi Barry,

I was referring to:

row 0: (0, 1.)
row 1: (1, 1.)
row 2: (2, 1.)
row 3: (3, 1.)
row 4: (4, 1.)
row 5: (5, 1.)
row 6: (6, 1.)
row 7: (7, 1.)
row 8: (8, 1.)
row 9: (9, 1.)
row:  0 col:  9 val:  0.00E+00  0.00E+00


The last line. But I was probably mistaken - if it was inserted it would have 
been
row 0: (0, 1.), (9, 0.)

on the first line instead?

Thibaut



On 25/10/2019 14:41, Smith, Barry F. wrote:



On Oct 24, 2019, at 5:09 AM, Thibaut Appel  wrote:

Hi Matthew,

Thanks for having a look, your example runs just like mine in Fortran.

In serial, the value (0.0,0.0) was inserted whereas it shouldn't have.

 I'm sorry, I don't see this for the serial case:

$ petscmpiexec -n 1 ./ex241f
Mat Object: 1 MPI processes
   type: seqaij
row 0: (0, 2.)
row 1: (1, 2.)
row 2: (2, 2.)
row 3: (3, 2.)
row 4: (4, 2.)
row 5: (5, 2.)
row 6: (6, 2.)
row 7: (7, 2.)
row 8: (8, 2.)
row 9: (9, 2.)

Where is the "(0.0,0.0) was inserted whereas it shouldn't have."?


Barry



In parallel, you'll see that an error "Inserting a new nonzero at global 
row/column" is triggered.

In both cases, MatSetValues tries to insert a zero value whereas 
IGNORE_ZERO_ENTRIES was set. That's what Barry is looking into, if I'm not 
mistaken.
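
For readers following along, a minimal sketch of the pattern being tested (not
the attached reproducer); A is assumed to be an AIJ matrix that has already been
preallocated and assembled once:

#include <petscmat.h>

static PetscErrorCode InsertZeroEntry(Mat A)
{
  PetscErrorCode ierr;
  PetscInt       row = 0, col = 9;
  PetscScalar    zero = 0.0;

  PetscFunctionBeginUser;
  ierr = MatSetOption(A, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE); CHKERRQ(ierr);
  ierr = MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE); CHKERRQ(ierr);
  /* with MAT_IGNORE_ZERO_ENTRIES set, this exact zero should be dropped rather
     than allocating a new nonzero at (0,9) */
  ierr = MatSetValues(A, 1, &row, 1, &col, &zero, INSERT_VALUES); CHKERRQ(ierr);
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}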



Thibaut



On 24/10/2019 02:31, Matthew Knepley wrote:

On Tue, Oct 22, 2019 at 1:37 PM Thibaut Appel  wrote:
Hi both,

Please find attached a tiny example (in Fortran, sorry Matthew) that - I think 
- reproduces the problem we mentioned.

Let me know.

Okay, I converted to C so I could understand, and it runs fine for me:

master *:~/Downloads/tmp$ PETSC_ARCH=arch-master-complex-debug make main
/PETSc3/petsc/bin/mpicc -o main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing 
-Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden 
-g3   -I/PETSc3/petsc/petsc-dev/include 
-I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
-I/PETSc3/petsc/include -I/opt/X11/include `pwd`/main.c
/PETSc3/petsc/bin/mpicc -Wl,-multiply_defined,suppress -Wl,-multiply_defined 
-Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first 
-Wl,-no_compact_unwind  -Wall -Wwrite-strings -Wno-strict-aliasing 
-Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden 
-g3  -o main main.o 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib 
-L/opt/X11/lib -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
-L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 -Wl,-rpath,/usr/local/lib 
-L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 -llapack -lblas -lhdf5hl_fortran 
-lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco -lparmetis -lmetis -ltriangle -lz -lX11 
-lctetgen -lstdc++ -ldl -lmpichf90 -lpmpich -lmpich -lopa -lmpl -lpthread 
-lgfortran -lgcc_s.10.5 -lstdc++ -ldl
master *:~/Downloads/tmp$ ./main
After first assembly:
Mat Object: 1 MPI processes
   type: seqaij
row 0: (0, 1. + 1. i)
row 1: (1, 1. + 1. i)
row 2: (2, 1. + 1. i)
row 3: (3, 1. + 1. i)
row 4: (4, 1. + 1. i)
row 5: (5, 1. + 1. i)
row 6: (6, 1. + 1. i)
row 7: (7, 1. + 1. i)
row 8: (8, 1. + 1. i)
row 9: (9, 1. + 1. i)
After second assembly:
Mat Object: 1 MPI processes
   type: seqaij
row 0: (0, 1. + 1. i)
row 1: (1, 1. + 1. i)
row 2: (2, 1. + 1. i)
row 3: (3, 1. + 1. i)
row 4: (4, 1. + 1. i)
row 5: (5, 1. + 1. i)
row 6: (6, 1. + 1. i)
row 7: (7, 1. + 1. i)
row 8: (8, 1. + 1. i)
row 9: (9, 1. + 1. i)
row 0 col: 9 val: 0. + 0. i

I attach my code.  So then I ran your Fortran:

/PETSc3/petsc/bin/mpif90 -c -Wall -ffree-line-length-0 
-Wno-unused-dummy-argument -Wno-unused-variable -g
-I/PETSc3/petsc/petsc-dev/include 
-I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
-I/PETSc3/petsc/include -I/opt/X11/include  -o main2.o main2.F90
/PETSc3/petsc/bin/mpif90 -Wl,-multiply_defined,suppress -Wl,-multiply_defined 
-Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first 
-Wl,-no_compact_unwind  -Wall -ffree-line-length-0 -Wno-unused-dummy-argument 
-Wno-unused-variable -g   -o main2 main2.o 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib 

Re: [petsc-users] 'Inserting a new nonzero' issue on a reassembled matrix in parallel

2019-10-25 Thread Smith, Barry F. via petsc-users


  That is just a write statement in the code. You are trying to put in the 0 
value but it is ignoring it, as it is supposed to, so all is good for the 
sequential case.



> On Oct 25, 2019, at 8:56 AM, Thibaut Appel  wrote:
> 
> Hi Barry,
> 
> I was referring to:
> 
> row 0: (0, 1.) 
> row 1: (1, 1.) 
> row 2: (2, 1.)
> row 3: (3, 1.) 
> row 4: (4, 1.) 
> row 5: (5, 1.) 
> row 6: (6, 1.) 
> row 7: (7, 1.) 
> row 8: (8, 1.) 
> row 9: (9, 1.) 
> row:  0 col:  9 val:  0.00E+00  0.00E+00
> 
> 
> The last line. But I was probably mistaken - if it was inserted it would have 
> been 
> row 0: (0, 1.), (9, 0.)
> 
> on the first line instead?
> 
> Thibaut
> 
> 
> 
> On 25/10/2019 14:41, Smith, Barry F. wrote:
>> 
>> 
>> > On Oct 24, 2019, at 5:09 AM, Thibaut Appel  
>> > wrote:
>> > 
>> > Hi Matthew,
>> > 
>> > Thanks for having a look, your example runs just like mine in Fortran.
>> > 
>> > In serial, the value (0.0,0.0) was inserted whereas it shouldn't have.
>> 
>> I'm sorry, I don't see this for the serial case: 
>> 
>> $ petscmpiexec -n 1 ./ex241f
>> Mat Object: 1 MPI processes
>>   type: seqaij
>> row 0: (0, 2.) 
>> row 1: (1, 2.) 
>> row 2: (2, 2.) 
>> row 3: (3, 2.) 
>> row 4: (4, 2.) 
>> row 5: (5, 2.) 
>> row 6: (6, 2.) 
>> row 7: (7, 2.) 
>> row 8: (8, 2.) 
>> row 9: (9, 2.) 
>> 
>> > 
>> Where is the "(0.0,0.0) was inserted whereas it shouldn't have."? 
>> 
>> 
>> Barry
>> 
>> 
>> > In parallel, you'll see that an error "Inserting a new nonzero at global 
>> > row/column" is triggered.
>> > 
>> > In both cases, MatSetValues tries to insert a zero value whereas 
>> > IGNORE_ZERO_ENTRIES was set. That's what Barry is looking into, if I'm not 
>> > mistaken.
>> > 
>> > 
>> > 
>> > Thibaut
>> > 
>> > 
>> > 
>> > On 24/10/2019 02:31, Matthew Knepley wrote:
>> >> On Tue, Oct 22, 2019 at 1:37 PM Thibaut Appel  
>> >> wrote:
>> >> Hi both,
>> >> 
>> >> Please find attached a tiny example (in Fortran, sorry Matthew) that - I 
>> >> think - reproduces the problem we mentioned.
>> >> 
>> >> Let me know.
>> >> 
>> >> Okay, I converted to C so I could understand, and it runs fine for me:
>> >> 
>> >> master *:~/Downloads/tmp$ PETSC_ARCH=arch-master-complex-debug make main
>> >> /PETSc3/petsc/bin/mpicc -o main.o -c -Wall -Wwrite-strings 
>> >> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
>> >> -Qunused-arguments -fvisibility=hidden -g3   
>> >> -I/PETSc3/petsc/petsc-dev/include 
>> >> -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
>> >> -I/PETSc3/petsc/include -I/opt/X11/include `pwd`/main.c
>> >> /PETSc3/petsc/bin/mpicc -Wl,-multiply_defined,suppress 
>> >> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs 
>> >> -Wl,-search_paths_first -Wl,-no_compact_unwind  -Wall -Wwrite-strings 
>> >> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
>> >> -Qunused-arguments -fvisibility=hidden -g3  -o main main.o 
>> >> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> >> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> >> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> >> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> >> -Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib 
>> >> -L/opt/X11/lib 
>> >> -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
>> >> -L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
>> >> -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 
>> >> -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco 
>> >> -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl 
>> >> -lmpichf90 -lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran -lgcc_s.10.5 
>> >> -lstdc++ -ldl
>> >> master *:~/Downloads/tmp$ ./main
>> >> After first assembly:
>> >> Mat Object: 1 MPI processes
>> >>   type: seqaij
>> >> row 0: (0, 1. + 1. i)
>> >> row 1: (1, 1. + 1. i)
>> >> row 2: (2, 1. + 1. i)
>> >> row 3: (3, 1. + 1. i)
>> >> row 4: (4, 1. + 1. i)
>> >> row 5: (5, 1. + 1. i)
>> >> row 6: (6, 1. + 1. i)
>> >> row 7: (7, 1. + 1. i)
>> >> row 8: (8, 1. + 1. i)
>> >> row 9: (9, 1. + 1. i)
>> >> After second assembly:
>> >> Mat Object: 1 MPI processes
>> >>   type: seqaij
>> >> row 0: (0, 1. + 1. i)
>> >> row 1: (1, 1. + 1. i)
>> >> row 2: (2, 1. + 1. i)
>> >> row 3: (3, 1. + 1. i)
>> >> row 4: (4, 1. + 1. i)
>> >> row 5: (5, 1. + 1. i)
>> >> row 6: (6, 1. + 1. i)
>> >> row 7: (7, 1. + 1. i)
>> >> row 8: (8, 1. + 1. i)
>> >> row 9: (9, 1. + 1. i)
>> >> row 0 col: 9 val: 0. + 0. i
>> >> 
>> >> I attach my code.  So then I ran your Fortran:
>> >> 
>> >> /PETSc3/petsc/bin/mpif90 -c -Wall -ffree-line-length-0 
>> >> -Wno-unused-dummy-argument -Wno-unused-variable -g
>> >> -I/PETSc3/petsc/petsc-dev/include 
>> >> -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
>> >> -I/PETSc3/petsc/include -I/opt/X11/include  -o main2.o main2.F90
>> >> /PETSc3/petsc/bin/mpif90 

Re: [petsc-users] Block preconditioning for 3d problem

2019-10-25 Thread Smith, Barry F. via petsc-users



> On Oct 25, 2019, at 1:12 AM, Dave Lee  wrote:
> 
> Thanks for your thoughts Barry, I will take this on board. It's not entirely 
> clear to me how to back out the dx updates for the other variables (velocity, 
> density, temperature) from the SNES at each Newton iteration (for the 
> pressure solve), though I guess I could just do these myself within the 
> SNESComputeFunction() user routine.

   I don't understand your algorithm or approach. Are you just doing a full 
nonlinear solve on (velocity, density, temperature, pressure) but inside the 
linear solve you want/need to specifically handle "pressure" as part of the 
preconditioner? 

> There is no doubt also a mechanism for doing this via FieldSplit that I'm not 
> familiar with.

   If this is the case, then FieldSplit was originally designed for this 
purpose. There are lots of examples showing how to use FieldSplit in a variety 
of ways.
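
A minimal sketch of what that looks like in code, assuming the application can
build two index sets (here hypothetically called isOther and isPressure)
describing the non-pressure and pressure unknowns:

#include <petscksp.h>

/* pull the pressure block out inside the KSP's preconditioner */
static PetscErrorCode SetupPressureSplit(KSP ksp, IS isOther, IS isPressure)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
  ierr = PCSetType(pc, PCFIELDSPLIT); CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "other", isOther); CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc, "pressure", isPressure); CHKERRQ(ierr);
  /* details such as -pc_fieldsplit_type schur, or the solvers used inside each
     split (option prefixes -fieldsplit_other_ and -fieldsplit_pressure_), can
     then be chosen at run time */
  ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}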

> 
> I've found a partial solution to my problem, doing a Gauss-Seidel two-stage 
> solve for the pressure at each Newton iteration, where I do an intermediate 
> solve based on implicit horizontal gradients, then a final solve (for the 
> current Newton iteration) based on implicit vertical gradients.
> 
> This converges well, but I think there are still time-splitting issues with 
> this uncoupled approach as the full solution goes unstable after a few time 
> steps (dt >> CFL).
> 
> I will take your advice and look at reformulating my outer problem for a SNES 
> (line search) solve.
> 
> Cheers, Dave.
> 
> On Fri, 25 Oct. 2019, 2:52 am Smith, Barry F.,  wrote:
> 
>   If you are "throwing things" away in computing the Jacobian then any 
> expectation that Newton will converge is also thrown out the window. 
> 
>   If you used the PETSc SNES nonlinear solver you would have much more space 
> to experiment in. For example you could use -snes_mf_operator to apply the 
> "true" matrix-vector product (computed with differencing) while building the 
> preconditioner by "throwing things" away and can still expect convergence of 
> the nonlinear solver.  It also has tools like -snes_jacobian_test that 
> compare your Jacobian to one computed by PETSc to see if there are bugs in 
> your Jacobian code and can compute the full Jacobian with differencing and 
> coloring pretty efficiently.  Also a large set of line search algorithms.
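
To make that concrete, a bare-bones sketch of handing the outer Newton loop to
SNES; FormFunction and FormJacobian are hypothetical user routines, and the
tools mentioned above (-snes_mf_operator, -snes_jacobian_test, the line
searches) then become run-time options rather than code changes:

#include <petscsnes.h>

/* hypothetical application callbacks for the residual and the approximate
   Jacobian, assumed to be defined elsewhere */
extern PetscErrorCode FormFunction(SNES, Vec, Vec, void *);
extern PetscErrorCode FormJacobian(SNES, Vec, Mat, Mat, void *);

static PetscErrorCode SolveWithSNES(Vec x, Vec r, Mat J, void *ctx)
{
  SNES           snes;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESCreate(PETSC_COMM_WORLD, &snes); CHKERRQ(ierr);
  ierr = SNESSetFunction(snes, r, FormFunction, ctx); CHKERRQ(ierr);
  /* J is only the approximate Jacobian; with -snes_mf_operator the true
     Jacobian action is applied matrix-free by differencing and J is used
     solely to build the preconditioner */
  ierr = SNESSetJacobian(snes, J, J, FormJacobian, ctx); CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes); CHKERRQ(ierr);
  ierr = SNESSolve(snes, NULL, x); CHKERRQ(ierr);
  ierr = SNESDestroy(&snes); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}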
> 
>   Newton is not simple (though it looks it) and you should bring power Newton 
> tools to the game, SNES, not home-brew Newton solvers, especially when 
> tackling problems with potentially interesting nonlinearities.
> 
>  Barry
> 
> 
> > On Oct 14, 2019, at 8:18 PM, Dave Lee  wrote:
> > 
> > Hi Barry,
> > 
> > I've replied inline:
> > 
> > On Mon, Oct 14, 2019 at 4:07 PM Smith, Barry F.  wrote:
> > 
> >   Thanks. It seems that the iterative method solved the entire linear 
> > system very accurately. For the last linear solve the initial true residual 
> > norm is around 10^6 (i.e. the norm of b) and the final one around 10^-10
> > 
> >   The initial  true residual norm on each block (part of a slab on one 
> > process) is around 10^-1 so I don't see a huge difference in scaling 
> > between the blocks. The final true residual for blocks is around 10^-17 so 
> > each block is well solved.
> > 
> > Thank you for taking the time to look at this Barry. Glad to know there's 
> > nothing too crazy going on here.
> >  
> > 
> >What do you mean by "quasi-Newton"? Are you doing any kind of line 
> > search or just using the full step from the linear solve? 
> > 
> > I'm using a full step from the linear solve. By "quasi-Newton" I just mean 
> > that the Jacobian is only approximate. I've left out and simplified some 
> > terms in order to make the Schur complement more tractable
> >  
> > 
> >When you solve with the single KSP over all slabs the convergence is 
> > determined by ||r_all||/||b_all|| < tol while slab by slab it is 
> > ||r_slab||/||b_slab|| < tol for each slab. If the slabs had hugely 
> > different scalings this could result in different "answers" but since the 
> > slabs seem to have very similar scalings the "answers" should be very 
> > similar. 
> > 
> >I would do the following. Solve the first linear system both ways and 
> > then compute the difference in the solution to the linear systems computed 
> > both ways. Relative to ||b|| the norm of the difference of the two linear 
> > systems solutions should be 10^-15 indicating both approaches produced 
> > almost identical answers. If this is the case then the problem is 
> > elsewhere, not in the solution of the linear system. On the other hand if 
> > the difference is large then likely the problem is in the formulation of 
> > the linear system; perhaps the combined linear system is not actually 
> > equivalent to the individual linear systems.
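
A minimal sketch of that check; xFull, xSlab and b are hypothetical vectors
holding the two solutions and the right-hand side:

#include <petscvec.h>

/* report ||x_full - x_slab|| / ||b|| for a solve done both ways */
static PetscErrorCode CompareSolutions(Vec xFull, Vec xSlab, Vec b)
{
  Vec            diff;
  PetscReal      normDiff, normB;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(xFull, &diff); CHKERRQ(ierr);
  ierr = VecWAXPY(diff, -1.0, xSlab, xFull); CHKERRQ(ierr); /* diff = xFull - xSlab */
  ierr = VecNorm(diff, NORM_2, &normDiff); CHKERRQ(ierr);
  ierr = VecNorm(b, NORM_2, &normB); CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "relative difference %g\n",
                     (double)(normDiff/normB)); CHKERRQ(ierr);
  ierr = VecDestroy(&diff); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}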
> > 
> > Thanks for the tip Barry. I tried this, and the normalised difference 
> > between the full solve and the slab-wise solve is approximately 1.0e-4 to 
> > 1.0e-3 

Re: [petsc-users] 'Inserting a new nonzero' issue on a reassembled matrix in parallel

2019-10-25 Thread Thibaut Appel via petsc-users

Hi Barry,

I was referring to:

row 0: (0, 1.)
row 1: (1, 1.)
row 2: (2, 1.)
row 3: (3, 1.)
row 4: (4, 1.)
row 5: (5, 1.)
row 6: (6, 1.)
row 7: (7, 1.)
row 8: (8, 1.)
row 9: (9, 1.)
row:  0 col:  9 val:  0.00E+00  0.00E+00


The last line. But I was probably mistaken - if it was inserted it would 
have been

row 0: (0, 1.), (9, 0.)

on the first line instead?

Thibaut



On 25/10/2019 14:41, Smith, Barry F. wrote:



> On Oct 24, 2019, at 5:09 AM, Thibaut Appel  wrote:

>
> Hi Matthew,
>
> Thanks for having a look, your example runs just like mine in Fortran.
>
> In serial, the value (0.0,0.0) was inserted whereas it shouldn't have.

    I'm sorry, I don't see this for the serial case:

$ petscmpiexec -n 1 ./ex241f
Mat Object: 1 MPI processes
  type: seqaij
row 0: (0, 2.)
row 1: (1, 2.)
row 2: (2, 2.)
row 3: (3, 2.)
row 4: (4, 2.)
row 5: (5, 2.)
row 6: (6, 2.)
row 7: (7, 2.)
row 8: (8, 2.)
row 9: (9, 2.)

>
Where is the "(0.0,0.0) was inserted whereas it shouldn't have."?


Barry


> In parallel, you'll see that an error "Inserting a new nonzero at global 
> row/column" is triggered.

>
> In both cases, MatSetValues tries to insert a zero value whereas 
> IGNORE_ZERO_ENTRIES was set. That's what Barry is looking into, if I'm not 
> mistaken.

>
>
>
> Thibaut
>
>
>
> On 24/10/2019 02:31, Matthew Knepley wrote:
>> On Tue, Oct 22, 2019 at 1:37 PM Thibaut Appel  wrote:

>> Hi both,
>>
>> Please find attached a tiny example (in Fortran, sorry Matthew) that - I 
>> think - reproduces the problem we mentioned.

>>
>> Let me know.
>>
>> Okay, I converted to C so I could understand, and it runs fine for me:
>>
>> master *:~/Downloads/tmp$ PETSC_ARCH=arch-master-complex-debug make main
>> /PETSc3/petsc/bin/mpicc -o main.o -c -Wall -Wwrite-strings 
-Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
-Qunused-arguments -fvisibility=hidden -g3   
-I/PETSc3/petsc/petsc-dev/include 
-I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
-I/PETSc3/petsc/include -I/opt/X11/include    `pwd`/main.c
>> /PETSc3/petsc/bin/mpicc -Wl,-multiply_defined,suppress 
-Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs 
-Wl,-search_paths_first -Wl,-no_compact_unwind  -Wall -Wwrite-strings 
-Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
-Qunused-arguments -fvisibility=hidden -g3  -o main main.o 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib 
-Wl,-rpath,/opt/X11/lib -L/opt/X11/lib 
-Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
-L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
-Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 
-llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 
-lchaco -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lstdc++ 
-ldl -lmpichf90 -lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran 
-lgcc_s.10.5 -lstdc++ -ldl

>> master *:~/Downloads/tmp$ ./main
>> After first assembly:
>> Mat Object: 1 MPI processes
>>   type: seqaij
>> row 0: (0, 1. + 1. i)
>> row 1: (1, 1. + 1. i)
>> row 2: (2, 1. + 1. i)
>> row 3: (3, 1. + 1. i)
>> row 4: (4, 1. + 1. i)
>> row 5: (5, 1. + 1. i)
>> row 6: (6, 1. + 1. i)
>> row 7: (7, 1. + 1. i)
>> row 8: (8, 1. + 1. i)
>> row 9: (9, 1. + 1. i)
>> After second assembly:
>> Mat Object: 1 MPI processes
>>   type: seqaij
>> row 0: (0, 1. + 1. i)
>> row 1: (1, 1. + 1. i)
>> row 2: (2, 1. + 1. i)
>> row 3: (3, 1. + 1. i)
>> row 4: (4, 1. + 1. i)
>> row 5: (5, 1. + 1. i)
>> row 6: (6, 1. + 1. i)
>> row 7: (7, 1. + 1. i)
>> row 8: (8, 1. + 1. i)
>> row 9: (9, 1. + 1. i)
>> row 0 col: 9 val: 0. + 0. i
>>
>> I attach my code.  So then I ran your Fortran:
>>
>> /PETSc3/petsc/bin/mpif90 -c -Wall -ffree-line-length-0 
-Wno-unused-dummy-argument -Wno-unused-variable -g 
-I/PETSc3/petsc/petsc-dev/include 
-I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
-I/PETSc3/petsc/include -I/opt/X11/include  -o main2.o main2.F90
>> /PETSc3/petsc/bin/mpif90 -Wl,-multiply_defined,suppress 
-Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs 
-Wl,-search_paths_first -Wl,-no_compact_unwind  -Wall 
-ffree-line-length-0 -Wno-unused-dummy-argument -Wno-unused-variable 
-g   -o main2 main2.o 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
-Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib 
-Wl,-rpath,/opt/X11/lib -L/opt/X11/lib 
-Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
-L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
-Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 
-llapack -lblas 

Re: [petsc-users] 'Inserting a new nonzero' issue on a reassembled matrix in parallel

2019-10-25 Thread Smith, Barry F. via petsc-users


> On Oct 24, 2019, at 5:09 AM, Thibaut Appel  wrote:
>
> Hi Matthew,
>
> Thanks for having a look, your example runs just like mine in Fortran.
>
> In serial, the value (0.0,0.0) was inserted whereas it shouldn't have.

I'm sorry, I don't see this for the serial case:

$ petscmpiexec -n 1 ./ex241f
Mat Object: 1 MPI processes
  type: seqaij
row 0: (0, 2.)
row 1: (1, 2.)
row 2: (2, 2.)
row 3: (3, 2.)
row 4: (4, 2.)
row 5: (5, 2.)
row 6: (6, 2.)
row 7: (7, 2.)
row 8: (8, 2.)
row 9: (9, 2.)

>
Where is the "(0.0,0.0) was inserted whereas it shouldn't have."?


Barry


> In parallel, you'll see that an error "Inserting a new nonzero at global 
> row/column" is triggered.
>
> In both cases, MatSetValues tries to insert a zero value whereas 
> IGNORE_ZERO_ENTRIES was set. That's what Barry is looking into, if I'm not 
> mistaken.
>
>
>
> Thibaut
>
>
>
> On 24/10/2019 02:31, Matthew Knepley wrote:
>> On Tue, Oct 22, 2019 at 1:37 PM Thibaut Appel  
>> wrote:
>> Hi both,
>>
>> Please find attached a tiny example (in Fortran, sorry Matthew) that - I 
>> think - reproduces the problem we mentioned.
>>
>> Let me know.
>>
>> Okay, I converted to C so I could understand, and it runs fine for me:
>>
>> master *:~/Downloads/tmp$ PETSC_ARCH=arch-master-complex-debug make main
>> /PETSc3/petsc/bin/mpicc -o main.o -c -Wall -Wwrite-strings 
>> -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector 
>> -Qunused-arguments -fvisibility=hidden -g3   
>> -I/PETSc3/petsc/petsc-dev/include 
>> -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
>> -I/PETSc3/petsc/include -I/opt/X11/include `pwd`/main.c
>> /PETSc3/petsc/bin/mpicc -Wl,-multiply_defined,suppress -Wl,-multiply_defined 
>> -Wl,suppress -Wl,-commons,use_dylibs -Wl,-search_paths_first 
>> -Wl,-no_compact_unwind  -Wall -Wwrite-strings -Wno-strict-aliasing 
>> -Wno-unknown-pragmas -fstack-protector -Qunused-arguments 
>> -fvisibility=hidden -g3  -o main main.o 
>> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib 
>> -L/opt/X11/lib -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
>> -L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
>> -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 
>> -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco 
>> -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl -lmpichf90 
>> -lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran -lgcc_s.10.5 -lstdc++ -ldl
>> master *:~/Downloads/tmp$ ./main
>> After first assembly:
>> Mat Object: 1 MPI processes
>>   type: seqaij
>> row 0: (0, 1. + 1. i)
>> row 1: (1, 1. + 1. i)
>> row 2: (2, 1. + 1. i)
>> row 3: (3, 1. + 1. i)
>> row 4: (4, 1. + 1. i)
>> row 5: (5, 1. + 1. i)
>> row 6: (6, 1. + 1. i)
>> row 7: (7, 1. + 1. i)
>> row 8: (8, 1. + 1. i)
>> row 9: (9, 1. + 1. i)
>> After second assembly:
>> Mat Object: 1 MPI processes
>>   type: seqaij
>> row 0: (0, 1. + 1. i)
>> row 1: (1, 1. + 1. i)
>> row 2: (2, 1. + 1. i)
>> row 3: (3, 1. + 1. i)
>> row 4: (4, 1. + 1. i)
>> row 5: (5, 1. + 1. i)
>> row 6: (6, 1. + 1. i)
>> row 7: (7, 1. + 1. i)
>> row 8: (8, 1. + 1. i)
>> row 9: (9, 1. + 1. i)
>> row 0 col: 9 val: 0. + 0. i
>>
>> I attach my code.  So then I ran your Fortran:
>>
>> /PETSc3/petsc/bin/mpif90 -c -Wall -ffree-line-length-0 
>> -Wno-unused-dummy-argument -Wno-unused-variable -g
>> -I/PETSc3/petsc/petsc-dev/include 
>> -I/PETSc3/petsc/petsc-dev/arch-master-complex-debug/include 
>> -I/PETSc3/petsc/include -I/opt/X11/include  -o main2.o main2.F90
>> /PETSc3/petsc/bin/mpif90 -Wl,-multiply_defined,suppress 
>> -Wl,-multiply_defined -Wl,suppress -Wl,-commons,use_dylibs 
>> -Wl,-search_paths_first -Wl,-no_compact_unwind  -Wall -ffree-line-length-0 
>> -Wno-unused-dummy-argument -Wno-unused-variable -g   -o main2 main2.o 
>> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -Wl,-rpath,/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -L/PETSc3/petsc/petsc-dev/arch-master-complex-debug/lib 
>> -Wl,-rpath,/PETSc3/petsc/lib -L/PETSc3/petsc/lib -Wl,-rpath,/opt/X11/lib 
>> -L/opt/X11/lib -Wl,-rpath,/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
>> -L/usr/local/lib/gcc/x86_64-apple-darwin10.4.0/4.6.0 
>> -Wl,-rpath,/usr/local/lib -L/usr/local/lib -lpetsc -lfftw3_mpi -lfftw3 
>> -llapack -lblas -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lchaco 
>> -lparmetis -lmetis -ltriangle -lz -lX11 -lctetgen -lstdc++ -ldl -lmpichf90 
>> -lpmpich -lmpich -lopa -lmpl -lpthread -lgfortran -lgcc_s.10.5 -lstdc++ -ldl
>> master *:~/Downloads/tmp$ ./main2
>>  After first assembly:
>> Mat Object: 1 MPI processes
>>   type: seqaij
>> row 0: (0, 1.)
>> row 1: (1,