Re: [petsc-users] Mat created by DMStag cannot access ghost points

2022-06-10 Thread Patrick Sanan
Sorry about the long delay on this.
https://gitlab.com/petsc/petsc/-/merge_requests/5329
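
In the meantime, a minimal sketch of the workaround Barry describes below,
assuming dm is the DMStag and A is the Mat obtained from DMCreateMatrix():

Mat A;
PetscCall(DMCreateMatrix(dm, &A));
/* Workaround: re-enable off-process insertion, which the preallocation
   code currently leaves switched off */
PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE));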




On Thu, 2 Jun 2022 at 15:01, Matthew Knepley wrote:

> On Thu, Jun 2, 2022 at 8:59 AM Patrick Sanan 
> wrote:
>
>> Thanks, Barry and Changqing! That seems reasonable to me, so I'll make an
>> MR with that change.
>>
>
> Hi Patrick,
>
> In the MR, could you add that option to all places we internally use
> Preallocator? I think we mean it for those.
>
>   Thanks,
>
>  Matt
>
>
>> On Wed, 1 Jun 2022 at 20:06, Barry Smith wrote:
>>
>>>
>>>   This appears to be a bug in the DMStag/Mat preallocator code. If you
>>> add after the DMCreateMatrix() line in your code
>>>
>>> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE));
>>>
>>> Your code will run correctly.
>>>
>>>   Patrick and Matt,
>>>
>>>   MatPreallocatorPreallocate_Preallocator() has
>>>
>>> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, p->nooffproc));
>>>
>>> to make the assembly of the stag matrix from the preallocator matrix a
>>> little faster,
>>>
>>> but then it never "undoes" this call. Hence the matrix is left in the
>>> state where it will error if someone sets values from a different rank
>>> (which they certainly can using DMStagMatSetValuesStencil()).
>>>
>>>  I think you need to clear the NO_OFF_PROC option at the end
>>> of MatPreallocatorPreallocate_Preallocator(): just because the
>>> preallocation process never needed communication does not mean that when
>>> someone puts real values in the matrix they will never use communication;
>>> they can put in values any dang way they please.
>>>
>>> I don't know why this bug has not come up before.
>>>
>>>   Barry
>>>
>>>
>>> On May 31, 2022, at 11:08 PM, Ye Changqing 
>>> wrote:
>>>
>>> Dear all,
>>>
>>> [BugReport.c] is a sample code, [BugReportParallel.output] is the output
>>> when executing BugReport with mpiexec, [BugReportSerial.output] is the output
>>> in serial execution.
>>>
>>> Best,
>>> Changqing
>>>
>>> --
>>> *From:* Dave May 
>>> *Sent:* May 31, 2022 22:55
>>> *To:* Ye Changqing 
>>> *Cc:* petsc-users@mcs.anl.gov 
>>> *Subject:* Re: [petsc-users] Mat created by DMStag cannot access ghost points
>>>
>>>
>>>
>>> On Tue 31. May 2022 at 16:28, Ye Changqing 
>>> wrote:
>>>
>>> Dear developers of PETSc,
>>>
>>> I encountered a problem when using the DMStag module. The program could
>>> be executed perfectly in serial, while errors are thrown out in parallel
>>> (using mpiexec). Some rows in Mat cannot be accessed in local processes
>>> when looping all elements in DMStag. The DM object I used only has one DOF
>>> in each element. Hence, I could switch to the DMDA module easily, and the
>>> program now is back to normal.
>>>
>>> Some snippets are below.
>>>
>>> Initialise a DMStag object:
>>> PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE,
>>> DM_BOUNDARY_NONE, M, N, PETSC_DECIDE, PETSC_DECIDE, 0, 0, 1,
>>> DMSTAG_STENCIL_BOX, 1, NULL, NULL, &(s_ctx->dm_P)));
>>> Created a Mat:
>>> PetscCall(DMCreateMatrix(s_ctx->dm_P, A));
>>> Loop:
>>> PetscCall(DMStagGetCorners(s_ctx->dm_V, &startx, &starty, &startz, &nx,
>>> &ny, &nz, &extrax, &extray, &extraz));
>>> for (ey = starty; ey < starty + ny; ++ey)
>>> for (ex = startx; ex < startx + nx; ++ex)
>>> {
>>> ...
>>> PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, *A, 2, &row[0], 2,
>>> &col[0], &val_A[0][0], ADD_VALUES));  // The traceback shows the problem is
>>> in here.
>>> }
>>>
>>>
>>> In addition to the code or MWE, please forward us the complete stack
>>> trace / error thrown to stdout.
>>>
>>> Thanks,
>>> Dave
>>>
>>>
>>>
>>> Best,
>>> Changqing
>>>
>>> 
>>>
>>>
>>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>


Re: [petsc-users] Mat created by DMStag cannot access ghost points

2022-06-02 Thread Patrick Sanan
Thanks, Barry and Changqing! That seems reasonable to me, so I'll make an
MR with that change.

On Wed, 1 Jun 2022 at 20:06, Barry Smith wrote:

>
>   This appears to be a bug in the DMStag/Mat preallocator code. If you add
> after the DMCreateMatrix() line in your code
>
> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, PETSC_FALSE));
>
> Your code will run correctly.
>
>   Patrick and Matt,
>
>   MatPreallocatorPreallocate_Preallocator() has
>
> PetscCall(MatSetOption(A, MAT_NO_OFF_PROC_ENTRIES, p->nooffproc));
>
> to make the assembly of the stag matrix from the preallocator matrix a
> little faster,
>
> but then it never "undoes" this call. Hence the matrix is left in the
> state where it will error if someone sets values from a different rank
> (which they certainly can using DMStagMatSetValuesStencil()).
>
>  I think you need to clear the NO_OFF_PROC option at the end
> of MatPreallocatorPreallocate_Preallocator(): just because the
> preallocation process never needed communication does not mean that when
> someone puts real values in the matrix they will never use communication;
> they can put in values any dang way they please.
>
> I don't know why this bug has not come up before.
>
>   Barry
>
>
> On May 31, 2022, at 11:08 PM, Ye Changqing 
> wrote:
>
> Dear all,
>
> [BugReport.c] is a sample code, [BugReportParallel.output] is the output
> when executing BugReport with mpiexec, [BugReportSerial.output] is the output
> in serial execution.
>
> Best,
> Changqing
>
> --
> *From:* Dave May 
> *Sent:* May 31, 2022 22:55
> *To:* Ye Changqing 
> *Cc:* petsc-users@mcs.anl.gov 
> *Subject:* Re: [petsc-users] Mat created by DMStag cannot access ghost points
>
>
>
> On Tue 31. May 2022 at 16:28, Ye Changqing 
> wrote:
>
> Dear developers of PETSc,
>
> I encountered a problem when using the DMStag module. The program could be
> executed perfectly in serial, while errors are thrown out in parallel
> (using mpiexec). Some rows in Mat cannot be accessed in local processes
> when looping all elements in DMStag. The DM object I used only has one DOF
> in each element. Hence, I could switch to the DMDA module easily, and the
> program now is back to normal.
>
> Some snippets are below.
>
> Initialise a DMStag object:
> PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE,
> DM_BOUNDARY_NONE, M, N, PETSC_DECIDE, PETSC_DECIDE, 0, 0, 1,
> DMSTAG_STENCIL_BOX, 1, NULL, NULL, &(s_ctx->dm_P)));
> Created a Mat:
> PetscCall(DMCreateMatrix(s_ctx->dm_P, A));
> Loop:
> PetscCall(DMStagGetCorners(s_ctx->dm_V, &startx, &starty, &startz, &nx,
> &ny, &nz, &extrax, &extray, &extraz));
> for (ey = starty; ey < starty + ny; ++ey)
> for (ex = startx; ex < startx + nx; ++ex)
> {
> ...
> PetscCall(DMStagMatSetValuesStencil(s_ctx->dm_P, *A, 2, &row[0], 2,
> &col[0], &val_A[0][0], ADD_VALUES));  // The traceback shows the problem is
> in here.
> }
>
>
> In addition to the code or MWE, please forward us the complete stack trace
> / error thrown to stdout.
>
> Thanks,
> Dave
>
>
>
> Best,
> Changqing
>
> 
>
>
>


Re: [petsc-users] MatColoring

2022-05-13 Thread Patrick Sanan
On Wed, 11 May 2022 at 21:28, Barry Smith wrote:

>
>   It is unlikely that the true number of colors is much less than 230 so I
> don't think you can save by reducing the number of colors more.
>
>   Are you using IS_COLORING_LOCAL? It eliminates much parallel
> communication, so you would definitely benefit from using it; depending on
> how your FormFunction is written, you may need to refactor your code
> slightly.
>
>   There is also a hidden parameter which is a tradeoff of memory usage vs
> speed. If memory is not an issue for you then you could change this
> parameter: it is computed in MatFDColoringCreate_MPIXAIJ() and is called bcols.
> You can just hardwire bcols to iscoloring->n; this may help a bit.
>
>    The only way to drastically improve the time beyond the above is to
> reorganize the PETSc code and your code to perform multiple function
> evaluations "at the same time", but this is a major project (it has to be
> done for GPUs).
>
>   MATCOLORINGMIS doesn't exist, it is a typo in the docs.
>
https://gitlab.com/petsc/petsc/-/merge_requests/5249

>
>   Barry
>
>
> On May 11, 2022, at 2:25 PM, Jorti, Zakariae  wrote:
>
> Hello,
>
> I have used the -mat_coloring_view and -mat_fd_coloring_view flags as you
> suggested and it turns out that the coloring used was MATCOLORINGSL.
> I have also tried other methods and obtained the following numbers of
> function evaluations per Jacobian building:
> - MATCOLORINGJP : 593
> - MATCOLORINGID : 327
> - MATCOLORINGSL : 321
> - MATCOLORINGGREEDY : 316
> - MATCOLORINGLF : 230
>
> Is there any parameter/trick we can try to further cut down the number of
> function evaluations? And also how could we use MATCOLORINGMIS as this
> latter does not appear in the MatColoringType options?
> Many thanks.
>
> Zakariae
> --
> *From:* Tang, Qi 
> *Sent:* Tuesday, May 10, 2022 10:51:07 AM
> *To:* Barry Smith; petsc-users@mcs.anl.gov
> *Cc:* Jorti, Zakariae; Tang, Xianzhu
> *Subject:* [EXTERNAL] Re: [petsc-users] MatColoring
>
> We are using SNES + TS + dmstag. The current bottleneck is the number of
> residual evaluation (more than 300 per Jacobian building using the default
> coloring from dmstag). We talked to Patrick and we are not sure how to
> improve further.
>
> So it looks like we should play with mat_coloring_type and see if others
> give us better performance.
>
> If there is anything else we can play with, please also let us know. We
> also lag Jacobian and only build once every three Newton iterations, which
> works well. Thanks,
>
> Qi
>
>
> On May 10, 2022, at 10:35 AM, Barry Smith  wrote:
>
>
>   This depends to some degree on how you are accessing and applying the
> Jacobian construction process.
>
>If you are using SNES or TS then the SNES object handles most of the
> work of organizing the PETSc objects needed to compute the Jacobian and you
> can control the choices via the options database.
>
>    With SNES/TS using a DM you can skip calling SNESSetJacobian() and
> simply use the option -snes_fd_color to have SNES compute the Jacobian for
> you. By default, the coloring is obtained from the DM (which is generally
> the best available coloring); you can have it use a coloring computed
> directly from the matrix structure with the options -snes_fd_color_use_mat
> -mat_coloring_type jp, power, sl, lf, ld, or greedy (see MatColoringType,
> MatColoringSetFromOptions).
> Use -mat_fd_coloring_view to get information on the computation of the
> Jacobian from the coloring, and -mat_coloring_view to get information on
> the coloring used (see MatFDColoringSetFromOptions).
>
>   If you are using SNES/TS but not using a DM you need to compute the
> nonzero structure of the matrix yourself into J and call
> SNESSetJacobian(snes,J,J,SNESComputeJacobianDefaultColor,matfdcoloring);
> you need to create the matfdcoloring object with MatFDColoringCreate()
> using an iscoloring you obtained with MatColoringCreate() etc.  The same
> command line arguments as above allow you to control the coloring algorithm
> used and to view them etc.
>
>   Barry
>
>
>
>
>
>
>
>
> On May 10, 2022, at 11:40 AM, Jorti, Zakariae via petsc-users <
> petsc-users@mcs.anl.gov> wrote:
>
> Hi,
>
> I am solving a non-linear problem and using a finite difference
> approximation with coloring to compute the Jacobian matrix.
>
> There are several coloring algorithms available in PETSc as indicated here:
> https://petsc.org/release/docs/manualpages/Mat/MatColoring.html
> 
>
> And I was wondering how to switch from one to another in the Jacobian
> setup routine and also how to check which coloring algorithm I am currently
> using.
>
> Thank you.
>
> Zakariae Jorti
>
>
>
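
For reference, a minimal sketch of the non-DM path Barry describes above,
assuming J already carries its nonzero structure, with hypothetical names
(snes, J, FormFunction, ctx):

MatColoring   mc;
ISColoring    iscoloring;
MatFDColoring matfdcoloring;
PetscCall(MatColoringCreate(J, &mc));
PetscCall(MatColoringSetType(mc, MATCOLORINGSL)); /* or via -mat_coloring_type */
PetscCall(MatColoringSetFromOptions(mc));
PetscCall(MatColoringApply(mc, &iscoloring));
PetscCall(MatColoringDestroy(&mc));
PetscCall(MatFDColoringCreate(J, iscoloring, &matfdcoloring));
PetscCall(MatFDColoringSetFunction(matfdcoloring, (PetscErrorCode (*)(void))FormFunction, ctx));
PetscCall(MatFDColoringSetFromOptions(matfdcoloring));
PetscCall(MatFDColoringSetUp(J, iscoloring, matfdcoloring));
PetscCall(ISColoringDestroy(&iscoloring));
PetscCall(SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, matfdcoloring));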


Re: [petsc-users] Combining DMDA and DMStag

2022-04-25 Thread Patrick Sanan
In that case, from your original message, it seems like the issue might be
in the

// populate edofF, edofT

Or, perhaps, it's simply due to the fact that the numbering/ordering
depends on the number of ranks. This is illustrated in this image in the
manual: https://petsc.org/release/docs/manual/vec/#fig-daao

This is for DMDA but DMStag does something similar. There is a "natural"
ordering, which is the ordering you'd get with a single rank. There's also
"PETSc ordering" which depends on the number of ranks - in this case each
rank's portion of the vector is a contiguous range.

If this might be the explanation for your problem, note that for DMDA there
are already some helper functions which can map "PETSc" to "natural"
ordering. While these obviously require a lot of communication between
ranks and so you'd want to avoid them in production runs, these are useful
in situations where you want to directly compare vectors on different
numbers of ranks for diagnostic or I/O purposes.

E.g. see this man page:
https://petsc.org/release/docs/manualpages/DMDA/DMDANaturalToGlobalBegin.html

I haven't implemented the equivalent for DMStag, but if it's needed I can
look at adding it.
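
For example, a sketch for the DMDA case (da is the DMDA and xglobal a
placeholder for one of its global vectors):

Vec xnatural;
PetscCall(DMDACreateNaturalVector(da, &xnatural));
PetscCall(DMDAGlobalToNaturalBegin(da, xglobal, INSERT_VALUES, xnatural));
PetscCall(DMDAGlobalToNaturalEnd(da, xglobal, INSERT_VALUES, xnatural));
/* xnatural now uses the rank-independent "natural" ordering, so its VecView()
   output can be compared across runs on different numbers of ranks */
PetscCall(VecView(xnatural, PETSC_VIEWER_STDOUT_WORLD));
PetscCall(VecDestroy(&xnatural));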

On Fri, 22 Apr 2022 at 17:28, Carl-Johan Thore <carl-johan.th...@liu.se> wrote:

> Thanks for the quick replies!
>
> Using dmstag for the temperature, or even for both fields is a possibility
> of course. Preferably however I'd like to keep the temp. code as it is and
> be able to quickly switch between different methods for the flow, based on
> dmda, dmstag and so on, so I'll try a bit more with the dmda-dmstag
> approach first.
>
>
>
>
>
> --
> *From:* Patrick Sanan 
> *Sent:* 22 April 2022 17:04
> *To:* Barry Smith 
> *Cc:* Carl-Johan Thore ; petsc-users@mcs.anl.gov
> 
> *Subject:* Re: [petsc-users] Combining DMDA and DMStag
>
>
>
> On Fri, 22 Apr 2022 at 16:04, Barry Smith wrote:
>
>
>We would need more details as to exactly what goes wrong to determine
> any kind of fix; my guess would be that the layout of the velocity vectors
> and temperature vectors is slightly different since the DMDA uses nelx+1
> while the stag uses nelx and may split things up slightly differently in
> parallel. You could try very small problems with say 2 or 4 ranks and put
> known values into the vectors and look at the ghost point update locations
> and exact locations in the local and global vectors to make sure everything
> is where you expect it to be.
>
>   There is also the possibility of using a DMStag for the temperature
> instead of DMDA since DMDA provides essentially a subset of the
> functionality of DMStag, to get a more consistent layout of the unknowns in
> the two vectors.
>
>
> This was going to be my first suggestion as well - one way to ensure
> compatibility would be to use DMStag for everything. E.g. you could create
> your temperature DMStag with only vertex/corner (dof0) degrees of freedom,
> and then create one or more "compatible" DMStags (same elements on each
> MPI rank) for your other fields, using DMStagCreateCompatibleDMStag().
>
>   Finally, one could use a single DMStag with all the unknowns but treat
> subvectors of the unknowns differently in your discretization and solve
> process. So assign velocity values (I guess) to the cell faces and
> temperature to the cell vertices, this will give consistent parallel
> decomposition of values but you will have to manage the fact that the
> unknowns are interlaced into a single vector so your solver portions of the
> code may need to "pull" out appropriate subvectors.
>
>   Barry
>
>
> On Apr 22, 2022, at 9:45 AM, Carl-Johan Thore 
> wrote:
>
> Hi!
>
> I'm working on a convection-diffusion heat transfer problem. The
> temperature
> is discretized using standard Q1 elements and a DMDA. The flow is modelled
> using a stabilized Q1-Q0 method for which DMStag seemed like a good
> choice. The codes for the temperature
> and flow work fine separately (both in serial and parallel), but when
> combined and running
> in parallel, a problem sometimes arises in the assembly of the thermal
> system matrix.
> Here’s a rough sketch of the combined code:
>
> // Create dmda for thermal problem and dmstag for flow problem
> DMDACreate3d(PETSC_COMM_WORLD, bx, by, bz, DMDA_STENCIL_BOX, nelx+1,
> nely+1, nelz+1, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
> 1, stencilWidth, 0, 0, 0,
> &dmthermal);
> …
> // A bit of code to adjust Lx,Ly,Lz so that dmthermal and dmflow are
> compatible in the sense of having the same
> // local elements
> …
> DMStagCreate3d(PETSC_COMM_WORLD, bx, by, bz, nelx, nely, nelz, md,nd,pd,
> 3,0,

Re: [petsc-users] Combining DMDA and DMStag

2022-04-22 Thread Patrick Sanan
On Fri, 22 Apr 2022 at 16:04, Barry Smith wrote:

>
>We would need more details as to exactly what goes wrong to determine
> any kind of fix; my guess would be that the layout of the velocity vectors
> and temperature vectors is slightly different since the DMDA uses nelx+1
> while the stag uses nelx and may split things up slightly differently in
> parallel. You could try very small problems with say 2 or 4 ranks and put
> known values into the vectors and look at the ghost point update locations
> and exact locations in the local and global vectors to make sure everything
> is where you expect it to be.
>
>   There is also the possibility of using a DMStag for the temperature
> instead of DMDA since DMDA provides essentially a subset of the
> functionality of DMStag, to get a more consistent layout of the unknowns in
> the two vectors.
>
>
This was going to be my first suggestion as well - one way to ensure
compatibility would be to use DMStag for everything. E.g. you could create
your temperature DMStag with only vertex/corner (dof0) degrees of freedom,
and then create one or more "compatible" DMStags (same elements on each
MPI rank) for your other fields, using DMStagCreateCompatibleDMStag().
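
For instance, a sketch (assuming dmflow is the DMStag used for the flow
problem):

DM  dmtemp;
Vec temperature;
/* a new DMStag with the same elements on each rank as dmflow, but a single
   temperature unknown per vertex (dof0 = 1) and nothing on the other strata */
PetscCall(DMStagCreateCompatibleDMStag(dmflow, 1, 0, 0, 0, &dmtemp));
PetscCall(DMSetUp(dmtemp));
PetscCall(DMCreateGlobalVector(dmtemp, &temperature));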

  Finally, one could use a single DMStag with all the unknowns but treat
> subvectors of the unknowns differently in your discretization and solve
> process. So assign velocity values (I guess) to the cell faces and
> temperature to the cell vertices, this will give consistent parallel
> decomposition of values but you will have to manage the fact that the
> unknowns are interlaced into a single vector so your solver portions of the
> code may need to "pull" out appropriate subvectors.
>
>   Barry
>
>
> On Apr 22, 2022, at 9:45 AM, Carl-Johan Thore 
> wrote:
>
> Hi!
>
> I'm working on a convection-diffusion heat transfer problem. The
> temperature
> is discretized using standard Q1 elements and a DMDA. The flow is modelled
> using a stabilized Q1-Q0 method for which DMStag seemed like a good
> choice. The codes for the temperature
> and flow work fine separately (both in serial and parallel), but when
> combined and running
> in parallel, a problem sometimes arises in the assembly of the thermal
> system matrix.
> Here’s a rough sketch of the combined code:
>
> // Create dmda for thermal problem and dmstag for flow problem
> DMDACreate3d(PETSC_COMM_WORLD, bx, by, bz, DMDA_STENCIL_BOX, nelx+1,
> nely+1, nelz+1, PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
> 1, stencilWidth, 0, 0, 0,
> &dmthermal);
> …
> // A bit of code to adjust Lx,Ly,Lz so that dmthermal and dmflow are
> compatible in the sense of having the same
> // local elements
> …
> DMStagCreate3d(PETSC_COMM_WORLD, bx, by, bz, nelx, nely, nelz, md,nd,pd,
> 3,0,0,0,DMSTAG_STENCIL_BOX,stencilWidth,Lx,Ly,Lz,&dmflow);
>
>
> PetscInt edofT[8];  // 8-noded element with 1 temp DOF per node
> DMStagStencil edofF[24];   // 8 nodes with 3 velocity DOFs each
>
> // Assemble thermal system matrix K
> for (PetscInt e=0 ...)   // Loop over local elements
> {
> // Populate edofF, edofT
>  // Get element velocities in ue from local
> velocity vector uloc
>
> DMStagVecGetValuesStencil(dmflow,uloc,24,edof,ue);
>   ...
>  Ke = Ke_diffusion + Ke_convection(ue)
>   ...
>  MatSetValuesLocal(K, 8, edofT, 8, edofT, Ke,
> ADD_VALUES);
> }
>
> This always works fine in serial, but depending on the mesh and the number
> of ranks,
> we don't always get the correct values in the element velocity vector ue.
> I suspect
> this has something to do with the ordering of the elements and/or the
> DOFs, because
> the elements in the global velocity vector are always the same but their
> order may change
> (judging from the output of VecView at least).
>
> Is it possible to ensure compatibility between the dm:s, or find some kind
> of mapping
> between them, so that something along the lines of the code above always
> works?
>
> Kind regards,
> Carl-Johan
>
>
>


Re: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16

2022-02-02 Thread Patrick Sanan
There is also the hedge of adding a parameter and API function to control
which of these two behaviors is used, and if trying to preallocate twice,
throwing an error that instructs the user how to change the behavior,
noting that it will increase peak memory usage.

On Tue, 1 Feb 2022 at 17:07, Jed Brown wrote:

> Stefano Zampini  writes:
>
> > On Tue, 1 Feb 2022 at 18:34, Jed Brown wrote:
> >
> >> Patrick Sanan  writes:
> >>
> >> > On Tue, 1 Feb 2022 at 16:20, Jed Brown wrote:
> >> >
> >> >> Patrick Sanan  writes:
> >> >>
> >> >> > Sorry about the delay on this. I can reproduce.
> >> >> >
> >> >> > This regression appears to be a result of this optimization:
> >> >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273
> >> >>
> >> >> Thanks for tracking this down. Is there a reason to prefer
> preallocating
> >> >> twice
> >> >>
> >> >>ierr =
> >> >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr);
> >> >>ierr =
> >> >>
> >>
> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr);
> >> >>
> >> >> versus using MatDuplicate() or MatConvert()?
> >> >>
> >>
> >
> > Jed
> >
> > this is not the point. Suppose you pass around only a preallocator, but
> do
> > not pass around the matrices. Reusing the preallocator should be allowed.
>
> The current code is not okay (crashing is not okay), but we should decide
> whether to consume the preallocator or to retain the data structure. Peak
> memory use is the main reason hash-based allocation hasn't been default and
> wasn't adopted sooner. Retaining the hash until the preallocator is
> destroyed increases that peak.
>


Re: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16

2022-02-01 Thread Patrick Sanan
That works, as in the attached example - Marius, would that work for your
case?
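
In outline, the pattern under discussion is (a sketch only, not the attached
ex251.c itself, assuming the preallocator has been assembled and A created):

/* preallocate A once from the preallocator ... */
ierr = MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr);
/* ... then duplicate the (still unassembled) nonzero pattern instead of
   calling MatPreallocatorPreallocate() a second time */
ierr = MatDuplicate(A,MAT_DO_NOT_COPY_VALUES,&A_duplicate);CHKERRQ(ierr);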

On Tue, 1 Feb 2022 at 16:33, Jed Brown wrote:

> Patrick Sanan  writes:
>
> > On Tue, 1 Feb 2022 at 16:20, Jed Brown wrote:
> >
> >> Patrick Sanan  writes:
> >>
> >> > Sorry about the delay on this. I can reproduce.
> >> >
> >> > This regression appears to be a result of this optimization:
> >> > https://gitlab.com/petsc/petsc/-/merge_requests/4273
> >>
> >> Thanks for tracking this down. Is there a reason to prefer preallocating
> >> twice
> >>
> >>ierr =
> >> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr);
> >>ierr =
> >>
> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr);
> >>
> >> versus using MatDuplicate() or MatConvert()?
> >>
> >
> > Maybe if your preallocation is an overestimate for each of two different
> > post-assembly non-zero structures in A and A_duplicate?
>
> Even then, why not preallocate A and duplicate immediately, before
> compressing out zeros?
>


ex251.c
Description: Binary data


Re: [petsc-users] [petsc-dev] MatPreallocatorPreallocate segfault with PETSC 3.16

2022-02-01 Thread Patrick Sanan
On Tue, 1 Feb 2022 at 16:20, Jed Brown wrote:

> Patrick Sanan  writes:
>
> > Sorry about the delay on this. I can reproduce.
> >
> > This regression appears to be a result of this optimization:
> > https://gitlab.com/petsc/petsc/-/merge_requests/4273
>
> Thanks for tracking this down. Is there a reason to prefer preallocating
> twice
>
>ierr =
> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A);CHKERRQ(ierr);
>ierr =
> MatPreallocatorPreallocate(preallocator,PETSC_TRUE,A_duplicate);CHKERRQ(ierr);
>
> versus using MatDuplicate() or MatConvert()?
>

Maybe if your preallocation is an overestimate for each of two different
post-assembly non-zero structures in A and A_duplicate?


Re: [petsc-users] MatPreallocatorPreallocate segfault with PETSC 3.16

2022-02-01 Thread Patrick Sanan
Sorry about the delay on this. I can reproduce.

This regression appears to be a result of this optimization:
https://gitlab.com/petsc/petsc/-/merge_requests/4273

The changes there include having MatPreallocator destroy its internal
hash structure within MatPreallocatorPreallocate(), which allows for a
lower overall memory footprint but prevents usage of the same
MatPreallocator object for two Mats. The error you see is because this hash
structure was destroyed during the first preallocation. We didn't catch
this because our test suite doesn't test that usage.
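
The failing usage pattern is essentially the following (a sketch with
hypothetical setup; preM is the MATPREALLOCATOR, A and B are AIJ matrices of
matching size, as in Marius's report below):

/* record the nonzero pattern once via MatSetValues() on preM, then ... */
PetscCall(MatAssemblyBegin(preM, MAT_FINAL_ASSEMBLY));
PetscCall(MatAssemblyEnd(preM, MAT_FINAL_ASSEMBLY));
/* ... try to preallocate two matrices from the same preallocator */
PetscCall(MatPreallocatorPreallocate(preM, PETSC_TRUE, A)); /* fine */
PetscCall(MatPreallocatorPreallocate(preM, PETSC_TRUE, B)); /* fails in 3.16:
   the hash structure was destroyed by the first call */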

cc'ing PETSc dev because I'm not sure how to best resolve this - enforce
that a MatPreallocator is only "good once", remove the PetscHSetIJDestroy()
calls and accept the bigger memory footprint, or something else more clever?

My test to reproduce with C, which can be included in our fix in
src/mat/tests , attached.

On Mon, 24 Jan 2022 at 10:33, Marius Buerkle wrote:

>
> Hi,
>
> I try to use MatPreallocatorPreallocate to allocate a MATMPIAIJ matrix A.
> I define the MATPREALLOCATOR preM with MatSetValues and then call
> MatPreallocatorPreallocate to get A. This works on the first call to
> MatPreallocatorPreallocate, but if I call MatPreallocatorPreallocate again
> with the same preM to get another matrix B then I get a segfault, although
> the program continues to run (see below). It worked with PETSc 3.15 but
> with 3.16 it stopped working.
> When I check mat_info_nz_allocated and mat_info_nz_used for the allocated
> matrix it looks correct for the first call, but on the second call
> mat_info_nz_used is 0. I also attached a minimal example.
>
>
> [0]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: Null argument, when expecting valid pointer
> [1]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: Null Pointer: Parameter # 1
> [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [0]PETSC ERROR: [1]PETSC ERROR: Null argument, when expecting valid pointer
> [1]PETSC ERROR: Petsc Development GIT revision: v3.16.3-686-g5e81a90  GIT
> Date: 2022-01-23 05:13:26 +
> [0]PETSC ERROR: ./prem_test on a  named cd001 by cdfmat_marius Mon Jan 24
> 18:21:17 2022
> [0]PETSC ERROR: Null Pointer: Parameter # 1
> [1]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting.
> [1]PETSC ERROR: Configure options
> --prefix=/home/cdfmat_marius/prog/petsc/petsc_main_dbg
> --with-scalar-type=complex --with-fortran-kernels=1 --with-64-bit-indices=0
> --CC=mpiicc --COPTFLAGS="-g -traceback" --CXX=mpiicpc --CXXOPTFLAGS="-g
> -traceback" --FC=mpiifort --FOPTFLAGS="-g -traceback" --with-mpi=1
> --with-x=0 --with-cuda=0
> --download-parmetis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.parmetis.tar.gz
> --download-parmetis-commit=HEAD
> --download-metis=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.metis.tar.gz
> --download-metis-commit=HEAD
> --download-slepc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.slepc_main.tar.gz
> --download-slepc-commit=HEAD
> --download-superlu_dist=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.superlu_dist.tar.gz
> --download-superlu_dist-commit=HEAD
> --download-mumps=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.mumps.tar.gz
> --download-mumps-commit=HEAD
> --download-hypre=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.hypre.tar.gz
> --download-hypre-commit=HEAD
> --download-hwloc=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/hwloc-2.5.0.tar.gz
> --download-sowing=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.sowing.tar.gz
> --download-elemental=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.elemental.tar.gz
> --download-elemental-commit=HEAD
> --download-make=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/make-4.2.1-6.fc28.tar.gz
> --download-ptscotch=/home/cdfmat_marius/prog/petsc/git/petsc_main/externalpackages/git.ptscotch.tar.gz
> --download-ptscotch-commit=HEAD --with-openmp=0 --with-pthread=0
> --with-cxx-dialect=c++11 --with-debugging=1 --with-cuda=0 --with-cudac=0
> --with-valgrind=0 --with-blaslapack-lib="-mkl=sequential
> -lmkl_scalapack_lp64 -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl"
> --with-scalapack-lib="-mkl=sequential -lmkl_scalapack_lp64
> -lmkl_blacs_intelmpi_lp64 -lpthread -lm -ldl"
> --with-mkl_pardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl
> --with-mkl_cpardiso-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl
> --with-mkl_sparse-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl
> --with-mkl_sparse_optimize-dir=/home/appli/intel/compilers_and_libraries_2020.4.304/linux/mkl
> [0]PETSC ERROR: Petsc Development GIT revision: v3.16.3-686-g5e81a90  GIT
> 

Re: [petsc-users] [EXTERNAL] Re: Question about PCFieldSplit

2022-01-31 Thread Patrick Sanan
The current behavior is that a single IS is returned for each stratum, so
if you have 2 unknowns on vertices, it should still return a single IS
including both of those unknowns per vertex. Am I understanding that that's
working as expected but you need *two* ISs in that case?
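
For reference, the call in question looks like this (a sketch; dm is the
DMStag):

PetscInt  nfields;
char    **names;
IS       *fields;
PetscCall(DMCreateFieldDecomposition(dm, &nfields, &names, &fields, NULL));
/* with 2 dof on vertices, the single vertex IS currently contains both
   unknowns at each vertex; there is no separate IS per component */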


On Mon, 31 Jan 2022 at 16:29, Jorti, Zakariae wrote:

> Hi Patrick,
>
>
> Thanks for your recent updates on DMStag.
> After getting the Finite Difference Coloring to work with our solver, I
> was testing the DMCreateFieldDecomposition routine that you added last
> week.
> It seems to work fine when there is only one unknown per location (i.e.
> one unknown on vertices and/or one unknown on edges and/or one unknown on
> faces and/or one unknown on elements).
> That being said, when there is more than one unknown in some location
> (let's say 2 unknowns on vertices for instance), I could not get the ISs
> for those two unknowns with that routine.
> Should I still rely on PCFieldSplitSetDetectSaddlePoint in this case?
> Many thanks once again.
>
>
> Kind regards,
>
>
> Zakariae
> --
> *From:* Patrick Sanan 
> *Sent:* Tuesday, January 25, 2022 5:36:17 AM
> *To:* Matthew Knepley
> *Cc:* Jorti, Zakariae; petsc-users@mcs.anl.gov; Tang, Xianzhu; Dave May
> *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit
>
> Here is an MR which intends to introduce some logic to support
> DMCreateFieldDecomposition(). It doesn't use the PetscSection approach,
> which might be preferable, but nonetheless is a necessary component, so I'd
> like to get it in, even if it has room for further optimization. Hopefully
> this can be followed fairly soon with some more examples and tests using
> PCFieldSplit itself.
>
> https://gitlab.com/petsc/petsc/-/merge_requests/4740
>
> On Wed, 23 Jun 2021 at 12:15, Matthew Knepley <knep...@gmail.com> wrote:
>
>> On Wed, Jun 23, 2021 at 12:51 AM Patrick Sanan 
>> wrote:
>>
>>> Hi Zakariae -
>>>
>>> The usual way to do this is to define an IS (index set) with the degrees
>>> of freedom of interest for the rows, and another one for the columns, and
>>> then use MatCreateSubMatrix [1].
>>>
>>> There's not a particularly convenient way to create an IS with the
>>> degrees of freedom corresponding to a particular "stratum" (i.e. elements,
>>> faces, edges, or vertices) of a DMStag, but fortunately I believe we have
>>> some code to do exactly this in a development branch.
>>>
>>> I'll track it down and see if it can quickly be added to the main branch.
>>>
>>
>> Note that an easy way to keep track of this would be to create a section
>> with the different locations as fields. This Section could then
>> easily create the ISes, and could automatically interface with
>> PCFIELDSPLIT.
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>>
>>> [1]:
>>> https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html
>>>
>>> On 22 Jun 2021, at 22:29, Jorti, Zakariae wrote:
>>>
>>> Hello,
>>>
>>> I am working on DMStag and I have one dof on vertices (let us call
>>> it V), one dof on edges (let us call it E), one dof on faces (let us
>>> call it F) and one dof on cells (let us call it C).
>>> I build a matrix on this DM, and I was wondering if there was a way to
>>> get blocks (or sub matrices) of this matrix corresponding to specific
>>> degrees of freedom, for example rows corresponding to V dofs and columns
>>> corresponding to E dofs.
>>> I already asked this question before and the answer I got was that I could
>>> call PCFieldSplitSetDetectSaddlePoint with the diagonal entries of the
>>> matrix being zero or nonzero.
>>> That worked well. Nonetheless, I am curious to know if there
>>> was another alternative that does not require creating a dummy matrix
>>> with appropriate diagonal entries and solving a dummy linear system with
>>> this matrix to define the splits.
>>>
>>>
>>> Many thanks.
>>>
>>> Best regards,
>>>
>>> Zakariae
>>> --
>>> *From:* petsc-users  on behalf of
>>> Tang, Qi 
>>> *Sent:* Sunday, April 18, 2021 11:51:59 PM
>>> *To:* Patrick Sanan
>>> *Cc:* petsc-users@mcs.anl.gov; Tang, Xianzhu
>>> *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit
>>>
>>> Thanks a lot, Patrick. We appreciate your help.
>>>
>>> Qi
>>>
>

Re: [petsc-users] Finite difference approximation of Jacobian

2022-01-27 Thread Patrick Sanan
Dave convinced me that it's so relatively easy to use MatPreallocator that
we might as well do it here even if we're going to do something more
general later. Here's a draft of how it looks in 1D - most of the changes
in the MR are just putting the guts of the matrix assembly in a separate
function so we can call it twice. The part that actually uses the
MatPreallocator API is very contained and so doesn't seem like it would be
difficult to refactor.
 https://gitlab.com/petsc/petsc/-/merge_requests/4769
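
Schematically, the two-pass pattern is the following (a sketch only, not the
MR's actual code; dm, A, and nLocal are placeholders, and
AssembleStencilEntries() is a hypothetical helper holding the shared assembly
loop, including MatAssemblyBegin/End):

Mat preallocator;
PetscCall(MatCreate(PetscObjectComm((PetscObject)dm), &preallocator));
PetscCall(MatSetSizes(preallocator, nLocal, nLocal, PETSC_DETERMINE, PETSC_DETERMINE));
PetscCall(MatSetType(preallocator, MATPREALLOCATOR));
PetscCall(MatSetUp(preallocator));
PetscCall(AssembleStencilEntries(dm, preallocator)); /* pass 1: pattern only */
PetscCall(MatPreallocatorPreallocate(preallocator, PETSC_TRUE, A));
PetscCall(MatDestroy(&preallocator));
PetscCall(AssembleStencilEntries(dm, A));            /* pass 2: real values */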

On Wed, 12 Jan 2022 at 14:53, Patrick Sanan <patrick.sa...@gmail.com> wrote:

> Thanks a lot, all! So given that there's still some debate about whether
> we should even use MATPREALLOCATOR or a better integration of that hash
> logic, as in Issue 852, I'll proceed with simply aping what DMDA does
> (with apologies for all this code duplication).
>
> One thing I had missed, which I just added, is respect for
> DMSetMatrixPreallocation() / -dm_preallocate_only . Now, when that's
> activated, the previous behavior should be recovered, but by default it'll
> assemble a matrix which can be used for coloring, as is the main objective
> of this thread.
>
> On Wed, 12 Jan 2022 at 08:53, Barry Smith wrote:
>
>>
>>   Actually given the Subject of this email "Finite difference
>> approximation of Jacobian" what I suggested is A complete story anyways for
>> Patrick since the user cannot provide numerical values (of course Patrick
>> could write a general new hash matrix type and then use it with zeros but
>> that is asking a lot of him).
>>
>>   Regarding Fande's needs. One could rig things so that later one could
>> "flip" back the matrix to again use the hasher for setting values when the
>> contacts change.
>>
>> > On Jan 12, 2022, at 2:48 AM, Barry Smith  wrote:
>> >
>> >
>> >
>> >> On Jan 12, 2022, at 2:22 AM, Jed Brown  wrote:
>> >>
>> >> Because if a user jumps right into MatSetValues() and then wants to
>> use the result, it better have the values. It's true that DM preallocation
>> normally just "inserts zeros" and thus matches what MATPREALLOCATOR does.
>> >
>> >  I am not saying the user jumps right into MatSetValues(). I am saying
>> they do a "value" free assembly to have the preallocation set up and then
>> they do the first real assembly; hence very much JUST a refactorization of
>> MATPREALLOCATE. So the user is "preallocation-aware" but does not have to
>> do anything difficult to determine their pattern analytically.
>> >
>> >  Of course handling the values also is fine and simplifies user code
>> and the conceptual ideas (since preallocation ceases to exist as a concept
>> for users). I have no problem with skipping the "JUST a refactorization of
>> MATPREALLOCATE code" but for Patrick's current pressing need I think a
>> refactorization of MATPREALLOCATE would be fastest to develop hence I
>> suggested it.
>> >
>> >  I did not look at Jed's branch but one way to eliminate the
>> preallocation part completely would be to have something like MATHASH and
>> then when the first assembly is complete do a MatConvert to AIJ or
>> whatever.  The conversion could be hidden inside the assemblyend of the
>> hash so the user need not be aware that something different was done the
>> first time through.  Doing it this way would easily allow multiple MATHASH*
>> implementations (written by different people) that would compete on speed
>> and memory usage.  So there could be MatSetType() and a new
>> MatSetInitialType() where the initial type would be automatically set to
>> some MATHASH if preallocation info is not provided.
>> >
>> >>
>> >> Barry Smith  writes:
>> >>
>> >>> Why does it need to handle values?
>> >>>
>> >>>> On Jan 12, 2022, at 12:43 AM, Jed Brown  wrote:
>> >>>>
>> >>>> I agree with this and even started a branch jed/mat-hash in (yikes!)
>> 2017. I think it should be default if no preallocation functions are
>> called. But it isn't exactly the MATPREALLOCATOR code because it needs to
>> handle values too. Should not be a lot of code and will essentially remove
>> this FAQ and one of the most irritating subtle aspects of new codes using
>> PETSc matrices.
>> >>>>
>> >>>>
>> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up
>> >&

Re: [petsc-users] Question about PCFieldSplit

2022-01-25 Thread Patrick Sanan
Here is an MR which intends to introduce some logic to support
DMCreateFieldDecomposition(). It doesn't use the PetscSection approach,
which might be preferable, but nonetheless is a necessary component, so I'd
like to get it in, even if it has room for further optimization. Hopefully
this can be followed fairly soon with some more examples and tests using
PCFieldSplit itself.

https://gitlab.com/petsc/petsc/-/merge_requests/4740
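
Once ISs for the strata are available, extracting a block such as the
V-row/E-column one asked about below reduces to a single call (a sketch, with
isV and isE standing for the vertex and edge ISs):

Mat A_VE;
PetscCall(MatCreateSubMatrix(A, isV, isE, MAT_INITIAL_MATRIX, &A_VE));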

On Wed, 23 Jun 2021 at 12:15, Matthew Knepley <knep...@gmail.com> wrote:

> On Wed, Jun 23, 2021 at 12:51 AM Patrick Sanan 
> wrote:
>
>> Hi Zakariae -
>>
>> The usual way to do this is to define an IS (index set) with the degrees
>> of freedom of interest for the rows, and another one for the columns, and
>> then use MatCreateSubMatrix [1].
>>
>> There's not a particularly convenient way to create an IS with the
>> degrees of freedom corresponding to a particular "stratum" (i.e. elements,
>> faces, edges, or vertices) of a DMStag, but fortunately I believe we have
>> some code to do exactly this in a development branch.
>>
>> I'll track it down and see if it can quickly be added to the main branch.
>>
>
> Note that an easy way to keep track of this would be to create a section
> with the different locations as fields. This Section could then
> easily create the ISes, and could automatically interface with
> PCFIELDSPLIT.
>
>   Thanks,
>
>  Matt
>
>
>>
>> [1]:
>> https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html
>>
>> On 22 Jun 2021, at 22:29, Jorti, Zakariae wrote:
>>
>> Hello,
>>
>> I am working on DMStag and I have one dof on vertices (let us call it V), one
>> dof on edges (let us call it E), one dof on faces (let us call it F) and
>> one dof on cells (let us call it C).
>> I build a matrix on this DM, and I was wondering if there was a way to
>> get blocks (or sub matrices) of this matrix corresponding to specific
>> degrees of freedom, for example rows corresponding to V dofs and columns
>> corresponding to E dofs.
>> I already asked this question before and the answer I got was that I could
>> call PCFieldSplitSetDetectSaddlePoint with the diagonal entries of the
>> matrix being zero or nonzero.
>> That worked well. Nonetheless, I am curious to know if there was another
>> alternative that does not require creating a dummy matrix with
>> appropriate diagonal entries and solving a dummy linear system with this
>> matrix to define the splits.
>>
>>
>> Many thanks.
>>
>> Best regards,
>>
>> Zakariae
>> --
>> *From:* petsc-users  on behalf of Tang,
>> Qi 
>> *Sent:* Sunday, April 18, 2021 11:51:59 PM
>> *To:* Patrick Sanan
>> *Cc:* petsc-users@mcs.anl.gov; Tang, Xianzhu
>> *Subject:* [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit
>>
>> Thanks a lot, Patrick. We appreciate your help.
>>
>> Qi
>>
>>
>>
>> On Apr 18, 2021, at 11:30 PM, Patrick Sanan 
>> wrote:
>>
>> We have this functionality in a branch, which I'm working on cleaning up
>> to get to master. It doesn't use PETScSection. Sorry about the delay!
>>
>> You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal
>> entries being zero or non-zero defines the splits correctly.
>>
>> Am 17.04.2021 um 21:09 schrieb Matthew Knepley :
>>
>> On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users <
>> petsc-users@mcs.anl.gov> wrote:
>>
>>> Hello,
>>>
>>> I have a DMStag grid with one dof on each edge and face center.
>>> I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I
>>> assume is already split but I am not sure how to determine the fields.
>>> In the DMStag examples (ex2.c and ex3.c), the
>>> function PCFieldSplitSetDetectSaddlePoint is used to determine those fields
>>> based on zero diagonal entries. In my case, I have a Jacobian matrix that
>>> does not have zero diagonal entries.
>>> Can I use that PCFieldSplitSetDetectSaddlePoint in this case?
>>> If not, how should I do?
>>> Should I do like this example (
>>> https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html
>>> ):
>>> const PetscInt Bfields[1] = {0},Efields[1] = {1};
>>> KSPGetPC(ksp,&pc);
>>> PCField

Re: [petsc-users] Finite difference approximation of Jacobian

2022-01-12 Thread Patrick Sanan
Thanks a lot, all! So given that there's still some debate about whether we
should even use MATPREALLOCATOR or a better integration of that hash logic,
as in Issue 852, I'll proceed with simply aping what DMDA does (with
apologies for all this code duplication).

One thing I had missed, which I just added, is respect for
DMSetMatrixPreallocation() / -dm_preallocate_only . Now, when that's
activated, the previous behavior should be recovered, but by default it'll
assemble a matrix which can be used for coloring, as is the main objective
of this thread.

On Wed, 12 Jan 2022 at 08:53, Barry Smith wrote:

>
>   Actually given the Subject of this email "Finite difference
> approximation of Jacobian" what I suggested is A complete story anyways for
> Patrick since the user cannot provide numerical values (of course Patrick
> could write a general new hash matrix type and then use it with zeros but
> that is asking a lot of him).
>
>   Regarding Fande's needs. One could rig things so that later one could
> "flip" back the matrix to again use the hasher for setting values when the
> contacts change.
>
> > On Jan 12, 2022, at 2:48 AM, Barry Smith  wrote:
> >
> >
> >
> >> On Jan 12, 2022, at 2:22 AM, Jed Brown  wrote:
> >>
> >> Because if a user jumps right into MatSetValues() and then wants to use
> the result, it better have the values. It's true that DM preallocation
> normally just "inserts zeros" and thus matches what MATPREALLOCATOR does.
> >
> >  I am not saying the user jumps right into MatSetValues(). I am saying
> they do a "value" free assembly to have the preallocation set up and then
> they do the first real assembly; hence very much JUST a refactorization of
> MATPREALLOCATE. So the user is "preallocation-aware" but does not have to
> do anything difficult to determine their pattern analytically.
> >
> >  Of course handling the values also is fine and simplifies user code and
> the conceptual ideas (since preallocation ceases to exist as a concept for
> users). I have no problem with skipping the "JUST a refactorization of
> MATPREALLOCATE code" but for Patrick's current pressing need I think a
> refactorization of MATPREALLOCATE would be fastest to develop hence I
> suggested it.
> >
> >  I did not look at Jed's branch but one way to eliminate the
> preallocation part completely would be to have something like MATHASH and
> then when the first assembly is complete do a MatConvert to AIJ or
> whatever.  The conversion could be hidden inside the assemblyend of the
> hash so the user need not be aware that something different was done the
> first time through.  Doing it this way would easily allow multiple MATHASH*
> implementations (written by different people) that would compete on speed
> and memory usage.  So there could be MatSetType() and a new
> MatSetInitialType() where the initial type would be automatically set to
> some MATHASH if preallocation info is not provided.
> >
> >>
> >> Barry Smith  writes:
> >>
> >>> Why does it need to handle values?
> >>>
> >>>> On Jan 12, 2022, at 12:43 AM, Jed Brown  wrote:
> >>>>
> >>>> I agree with this and even started a branch jed/mat-hash in (yikes!)
> 2017. I think it should be default if no preallocation functions are
> called. But it isn't exactly the MATPREALLOCATOR code because it needs to
> handle values too. Should not be a lot of code and will essentially remove
> this FAQ and one of the most irritating subtle aspects of new codes using
> PETSc matrices.
> >>>>
> >>>>
> https://petsc.org/release/faq/#assembling-large-sparse-matrices-takes-a-long-time-what-can-i-do-to-make-this-process-faster-or-matsetvalues-is-so-slow-what-can-i-do-to-speed-it-up
> >>>>
> >>>> Barry Smith  writes:
> >>>>
> >>>>> I think the MATPREALLOCATOR as a MatType is a cumbersome strange
> thing and would prefer it was just functionality that Mat provided
> directly; for example MatSetOption(mat, preallocator_mode,true);
> matsetvalues,... MatAssemblyBegin/End. Now the matrix has its proper
> nonzero structure of whatever type the user set initially, aij, baij,
> sbaij,  And the user can now use it efficiently.
> >>>>>
> >>>>> Barry
> >>>>>
> >>>>> So turning on the option just swaps out temporarily the operations
> for MatSetValues and AssemblyBegin/End to be essentially those in
> MATPREALLOCATOR. The refactorization should take almost no time and would
> be faster than trying to rig dmstag to use MATPREALLOCATOR a

Re: [petsc-users] Finite difference approximation of Jacobian

2022-01-11 Thread Patrick Sanan
Working on doing this incrementally, in progress here:
https://gitlab.com/petsc/petsc/-/merge_requests/4712

This works in 1D for AIJ matrices, assembling a matrix with a maximal
number of zero entries as dictated by the stencil width (which is intended
to be very very close to what DMDA would do if you
associated all the unknowns with a particular grid point, which is the way
DMStag largely works under the hood).

Dave, before I get into it, am I correct in my understanding that
MATPREALLOCATOR would be better here because you would avoid superfluous
zeros in the sparsity pattern,
because this routine wouldn't have to assemble the Mat returned by
DMCreateMatrix()?

If this seems like a sane way to go, I will continue to add some more tests
(in particular periodic BCs not tested yet) and add the code for 2D and 3D.



On Mon, 13 Dec 2021 at 20:17, Dave May wrote:

>
>
> On Mon, 13 Dec 2021 at 20:13, Matthew Knepley  wrote:
>
>> On Mon, Dec 13, 2021 at 1:52 PM Dave May  wrote:
>>
>>> On Mon, 13 Dec 2021 at 19:29, Matthew Knepley  wrote:
>>>
 On Mon, Dec 13, 2021 at 1:16 PM Dave May 
 wrote:

>
>
> On Sat 11. Dec 2021 at 22:28, Matthew Knepley 
> wrote:
>
>> On Sat, Dec 11, 2021 at 1:58 PM Tang, Qi  wrote:
>>
>>> Hi,
>>> Does anyone have comment on finite difference coloring with DMStag?
>>> We are using DMStag and TS to evolve some nonlinear equations 
>>> implicitly.
>>> It would be helpful to have the coloring Jacobian option with that.
>>>
>>
>> Since DMStag produces the Jacobian connectivity,
>>
>
> This is incorrect.
> The DMCreateMatrix implementation for DMSTAG only sets the number of
> nonzeros (very inaccurately). It does not insert any zero values and thus
> the nonzero structure is actually not defined.
> That is why coloring doesn’t work.
>

 Ah, thanks Dave.

 Okay, we should fix that. It is perfectly possible to compute the
 nonzero pattern from the DMStag information.

>>>
>>> Agreed. The API for DMSTAG is complete enough to enable one to
>>> loop over the cells, and for all quantities defined on the cell (centre,
>>> face, vertex),
>>> insert values into the appropriate slot in the matrix.
>>> Combined with MATPREALLOCATOR, I believe a compact and readable
>>> code should be possible to write for the preallocation (cf DMDA).
>>>
>>> I think the only caveat with the approach of using all quantities
>>> defined on the cell is that
>>> it may slightly over-allocate depending on how the user wishes to impose
>>> the boundary condition,
>>> or slightly over-allocate for, say, Stokes where there is no
>>> pressure-pressure coupling term.
>>>
>>
>> Yes, and would not handle higher order stencils. I think the
>> overallocating is livable for the first implementation.
>>
>>
> Sure, but neither does DMDA.
>
> The user always has to know what they are doing and set the stencil width
> accordingly.
> I actually had this point listed in my initial email (and the stencil
> growth issue when using FD for nonlinear problems),
> however I deleted it as all the same issues exist in DMDA and no one
> complains (at least not loudly) :D
>
>
>
>
>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Thanks,
>>> Dave
>>>
>>>
 Paging Patrick :)

   Thanks,

 Matt


> Thanks,
> Dave
>
>
> you can use -snes_fd_color_use_mat. It has many options. Here is an
>> example of us using that:
>>
>>
>> https://gitlab.com/petsc/petsc/-/blob/main/src/snes/tutorials/ex19.c#L898
>>
>>   Thanks,
>>
>>  Matt
>>
>>
>>> Thanks,
>>> Qi
>>>
>>>
>>> On Oct 15, 2021, at 3:07 PM, Jorti, Zakariae via petsc-users <
>>> petsc-users@mcs.anl.gov> wrote:
>>>
>>> Hello,
>>>
>>> Does the Jacobian approximation using coloring and finite
>>> differencing of the function evaluation work in DMStag?
>>> Thank you.
>>> Best regards,
>>>
>>> Zakariae
>>>
>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which 
>> their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>

 --
 What most experimenters take for granted before they begin their
 experiments is infinitely more interesting than any results to which their
 experiments lead.
 -- Norbert Wiener

 https://www.cse.buffalo.edu/~knepley/
 

>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 

Re: [petsc-users] Finite difference approximation of Jacobian

2021-12-16 Thread Patrick Sanan
I will work on adding this - long overdue work from me!

On Tue, 14 Dec 2021 at 18:34, Tang, Qi wrote:

> Dear all,
>
> Will someone be able to help with this coloring request on dmstag in the
> next few weeks? If not, we will try to fix that on our own. We really need
> this capability for both debugging as well as performance comparison vs
> analytical Jacobian/preconditioning we implemented. Thanks.
>
> Qi
> LANL
>
>
>
> On Dec 13, 2021, at 12:26 PM, Tang, Qi  wrote:
>
> “overallocating” is exactly what we can live on at the moment, as long as
> it is easier to work with coloring on dmstag.
>
> So it sounds like if we can provide a preallocated matrix with a proper
> stencil through DMCreateMatrix, then it should work with dmstag and
> coloring already. Most APIs are already there.
>
> Qi
>
>
>
> On Dec 13, 2021, at 12:13 PM, Matthew Knepley  wrote:
>
> Yes, and would not handle higher order stencils. I think the
> overallocating is livable for the first implementation.
>
>
>
>


Re: [petsc-users] non-manifold DMPLEX

2021-12-12 Thread Patrick Sanan
This would be a particularly useful example for testing the PETSc library,
as most applications are set on manifolds. There’s a recently-reformatted
chapter in the developers docs on how the testing works - this explains the
/* TEST */ blocks you see. If parts of these docs are confusing, that’s
very useful feedback for us!
https://petsc.org/release/developers/testing
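
Those blocks sit in a comment at the bottom of each example source file and
look roughly like this (the suffix, nsize, and args here are just placeholders):

/*TEST

   test:
      suffix: 1
      nsize: 2
      args: -some_option 2 -dm_view

TEST*/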

Matthew Knepley wrote on Mon, 13 Dec 2021 at 03:12:

> On Sun, Dec 12, 2021 at 4:36 PM TARDIEU Nicolas 
> wrote:
>
>> Dear Patrick and Matthew,
>>
>> Thank you very much for your answers. I am gonna try to set up such a
>> test by assigning cell types.
>> Shall I open a MR in order to contribute this example ?
>>
>
> Yes, that would be great.
>
>   Thanks,
>
> Matt
>
>
>> Regards,
>> Nicolas
>>
>> ------
>> *From:* knep...@gmail.com 
>> *Sent:* Sunday, December 12, 2021 12:17
>> *To:* Patrick Sanan 
>> *Cc:* TARDIEU Nicolas ; petsc-users@mcs.anl.gov
>> 
>> *Subject:* Re: [petsc-users] non-manifold DMPLEX
>>
>> On Sun, Dec 12, 2021 at 6:11 AM Patrick Sanan 
>> wrote:
>>
>> Here you have the following "points":
>>
>> - 1 3-cell (the cube volume)
>> - 7 2-cells (the 6 faces of the cube plus the extra one)
>> - 16 1-cells  (the 12 edges of the cube, plus 3 extra ones from the extra
>> face, plus the extra edge)
>> - 11 0-cells (the 8 vertices of the cube, plus 2 extra ones from the extra
>> face, plus the extra vertex)
>>
>> You could encode your mesh as here, by directly specifying relationships
>> between these points in the Hasse diagram:
>>
>> https://petsc.org/release/docs/manual/dmplex/#representing-unstructured-grids
>>
>> Then, maybe the special relation is captured because you've defined the
>> "cone" or "support" for each "point", which tells you about the local
>> topology everywhere. E.g. to take the simpler case, three of the faces have
>> the yellow edge in their "cone", or equivalently the yellow edge has those
>> three faces in its "support".
>>
>>
>> This is correct. I can help you make this if you want. I think if you
>> assign cell types, you can even get Plex to automatically interpolate.
>>
>> Note that with this kind of mesh, algorithms which assume a uniform cell
>> dimension will break, but I am guessing you would not
>> be interested in those anyway.
>>
>>   Thanks,
>>
>> Matt
>>
>>
>> On Fri, 10 Dec 2021 at 17:04, TARDIEU Nicolas via petsc-users wrote:
>>
>> Dear PETSc Team,
>>
>> Following a previous discussion on the mailing list, I'd like to
>> experiment with DMPLEX with a very simple non-manifold mesh as shown in the
>> attached picture : a cube connected to a square by an edge and to an edge
>> by a point.
>> I have read some of the papers that Matthew et al. have written, but I
>> must admit that I do not see how to start...
>> I see how to define the different elements but I do not see how to
>> specify the special relationship between the cube and the square and
>> between the cube and the edge.
>> Once it will have been set correctly, what I am hoping is to be able to
>> use all the nice features of the DMPLEX object.
>>
>> Best regards,
>> Nicolas
>>
>>

Re: [petsc-users] non-manifold DMPLEX

2021-12-12 Thread Patrick Sanan
Here you have the following "points":

- 1 3-cell (the cube volume)
- 7 2-cells (the 6 faces of the cube plus the extra one)
- 16 1-cells  (the 12 edges of the cube, plus 3 extra ones from the extra
face, plus the extra edge)
- 11 0-cells (the 8 vertices of the cube, plus 2 extra ones from the extra
face, plus the extra vertex)

You could encode your mesh as here, by directly specifying relationships
between these points in the Hasse diagram:
https://petsc.org/release/docs/manual/dmplex/#representing-unstructured-grids

Then, maybe the special relation is captured because you've defined the
"cone" or "support" for each "point", which tells you about the local
topology everywhere. E.g. to take the simpler case, three of the faces have
the yellow edge in their "cone", or equivalently the yellow edge has those
three faces in its "support".
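
For concreteness, here is a minimal sketch of that kind of hand-built Hasse
diagram, using a much smaller toy mesh with the same non-manifold flavor (one
triangle plus a dangling edge sharing a vertex) rather than the full cube. The
point numbering and the mesh itself are made up purely for illustration, and
it assumes a recent PETSc with PetscCall():

/* Illustrative sketch only: encode a tiny non-manifold mesh in DMPlex by hand.
   Points: 0 = triangle (2-cell), 1-3 = its edges, 4 = extra dangling edge,
   5-7 = triangle vertices, 8 = extra vertex. Vertex 5 is shared by the
   triangle's edges and the dangling edge. */
#include <petscdmplex.h>

int main(int argc, char **argv)
{
  DM dm;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(DMPlexCreate(PETSC_COMM_WORLD, &dm));
  PetscCall(DMSetDimension(dm, 2));
  PetscCall(DMPlexSetChart(dm, 0, 9));

  /* cone sizes: the triangle has 3 edges, each edge has 2 vertices */
  PetscCall(DMPlexSetConeSize(dm, 0, 3));
  for (PetscInt e = 1; e <= 4; ++e) PetscCall(DMPlexSetConeSize(dm, e, 2));
  PetscCall(DMSetUp(dm)); /* allocate cone storage */

  {
    const PetscInt coneTri[]   = {1, 2, 3}; /* triangle -> its edges          */
    const PetscInt coneE1[]    = {5, 6};    /* edge 1 -> vertices 5, 6        */
    const PetscInt coneE2[]    = {6, 7};
    const PetscInt coneE3[]    = {7, 5};
    const PetscInt coneExtra[] = {5, 8};    /* dangling edge shares vertex 5  */
    PetscCall(DMPlexSetCone(dm, 0, coneTri));
    PetscCall(DMPlexSetCone(dm, 1, coneE1));
    PetscCall(DMPlexSetCone(dm, 2, coneE2));
    PetscCall(DMPlexSetCone(dm, 3, coneE3));
    PetscCall(DMPlexSetCone(dm, 4, coneExtra));
  }
  PetscCall(DMPlexSymmetrize(dm)); /* build supports from the cones */
  PetscCall(DMPlexStratify(dm));   /* compute the Hasse-diagram strata */

  PetscCall(DMViewFromOptions(dm, NULL, "-dm_view"));
  PetscCall(DMDestroy(&dm));
  PetscCall(PetscFinalize());
  return 0;
}

The cube-plus-square-plus-edge mesh is encoded with exactly the same calls,
just with the larger chart and cone lists described above.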



Am Fr., 10. Dez. 2021 um 17:04 Uhr schrieb TARDIEU Nicolas via petsc-users <
petsc-users@mcs.anl.gov>:

> Dear PETSc Team,
>
> Following a previous discussion on the mailing list, I'd like to
> experiment with DMPLEX with a very simple non-manifold mesh as shown in the
> attached picture : a cube connected to a square by an edge and to an edge
> by a point.
> I have read some of the papers that Matthew et al. have written, but I
> must admit that I do not see how to start...
> I see how to define the different elements but I do not see how to
> specify the special relationship between the cube and the square and
> between the cube and the edge.
> Once it will have been set correctly, what I am hoping is to be able to
> use all the nice features of the DMPLEX object.
>
> Best regards,
> Nicolas
>
>


Re: [petsc-users] CV_CONV_FAILURE with TSSUNDIALS in PETSc

2021-12-07 Thread Patrick Sanan
Note that if you have a call to TSSetFromOptions() in your code (again, see
TS ex19 in src/ts/tutorials/ex19.c) then you can supply command line
options and see a lot of information about the timestepper. If you run the
example with the -help argument you'll get a big list of options which may
apply. Here's an example with ex19 which sets a small initial timestep and
then lets TSBDF adapt:

$ ./ex19 -ts_view -ts_monitor -ts_type bdf -ts_initial_timestep 0.0001
-ts_dt 1e-8
0 TS dt 1e-08 time 0.
1 TS dt 2e-08 time 1e-08
2 TS dt 4e-08 time 3e-08
3 TS dt 8e-08 time 7e-08
4 TS dt 1.6e-07 time 1.5e-07
5 TS dt 3.2e-07 time 3.1e-07
6 TS dt 6.4e-07 time 6.3e-07
7 TS dt 1.28e-06 time 1.27e-06
8 TS dt 2.56e-06 time 2.55e-06
9 TS dt 5.12e-06 time 5.11e-06
10 TS dt 1.024e-05 time 1.023e-05
11 TS dt 2.048e-05 time 2.047e-05
12 TS dt 4.096e-05 time 4.095e-05
13 TS dt 8.192e-05 time 8.191e-05
14 TS dt 0.00016384 time 0.00016383
15 TS dt 0.00032768 time 0.00032767
16 TS dt 0.00065536 time 0.00065535
17 TS dt 0.00131072 time 0.00131071
18 TS dt 0.00262144 time 0.00262143
19 TS dt 0.00524288 time 0.00524287
20 TS dt 0.0104858 time 0.0104858
21 TS dt 0.0209715 time 0.0209715
22 TS dt 0.041943 time 0.041943
23 TS dt 0.0838861 time 0.0838861
24 TS dt 0.167772 time 0.167772
25 TS dt 0.177781 time 0.335544
26 TS dt 0.14402 time 0.482242
27 TS dt 0.130984 time 0.626262
TS Object: 1 MPI processes
  type: bdf
Order=2
  maximum time=0.5
  total number of I function evaluations=120
  total number of I Jacobian evaluations=91
  total number of nonlinear solver iterations=91
  total number of linear solver iterations=91
  total number of nonlinear solve failures=0
  total number of rejected steps=1
  using relative error tolerance of 0.0001,   using absolute error
tolerance of 0.0001
  TSAdapt Object: 1 MPI processes
type: basic
safety factor 0.9
extra safety factor after step rejection 0.5
clip fastest increase 2.
clip fastest decrease 0.1
maximum allowed timestep 1e+20
minimum allowed timestep 1e-20
maximum solution absolute value to be ignored -1.
  SNES Object: 1 MPI processes
type: newtonls
maximum iterations=50, maximum function evaluations=1
tolerances: relative=1e-08, absolute=1e-50, solution=1e-08
total number of linear solver iterations=13
total number of function evaluations=14
norm schedule ALWAYS
SNESLineSearch Object: 1 MPI processes
  type: bt
interpolation: cubic
alpha=1.00e-04
  maxstep=1.00e+08, minlambda=1.00e-12
  tolerances: relative=1.00e-08, absolute=1.00e-15,
lambda=1.00e-08
  maximum iterations=40
KSP Object: 1 MPI processes
  type: gmres
restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
happy breakdown tolerance 1e-30
  maximum iterations=1, initial guess is zero
  tolerances:  relative=1e-05, absolute=1e-50, divergence=1.
  left preconditioning
  using PRECONDITIONED norm type for convergence test
PC Object: 1 MPI processes
  type: ilu
out-of-place factorization
0 levels of fill
tolerance for zero pivot 2.22045e-14
matrix ordering: natural
factor fill ratio given 1., needed 1.
  Factored matrix follows:
Mat Object: 1 MPI processes
  type: seqaij
  rows=2, cols=2
  package used to perform factorization: petsc
  total: nonzeros=4, allocated nonzeros=4
using I-node routines: found 1 nodes, limit used is 5
  linear system matrix = precond matrix:
  Mat Object: 1 MPI processes
type: seqaij
rows=2, cols=2
total: nonzeros=4, allocated nonzeros=4
total number of mallocs used during MatSetValues calls=0
  using I-node routines: found 1 nodes, limit used is 5
steps  27, ftime 0.626262
Vec Object: 1 MPI processes
  type: seq
-0.635304
-1.98947


As a bit of an aside, we actually just merged an option for TSSUNDIALS into
the main branch which might possibly give extra data here. If you supply
the -ts_sundials_use_dense option, it will perform a dense linear solve
instead of an iterative one. This is of course inefficient in most cases
and only works in serial, but it can be very helpful to eliminate the
possibility that the issue arises from the parameters of the iterative
linear solve.



Am Mo., 6. Dez. 2021 um 20:30 Uhr schrieb Matthew Knepley :

> On Mon, Dec 6, 2021 at 2:04 PM Sanjoy Kumar Mazumder 
> wrote:
>
>> Thank you for your suggestion. I will look into the documentation of
>> ts_type bdf and try implementing the same. However, it seems like there is
>> not much example on TSBDF available. If you can kindly share with me an
>> example on the implementation of TSBDF it would be helpful.
>>
>
> If your problem is defined using IFunction, then it works like any other
> implicit solver. You can look at any example 
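
For instance, a minimal sketch of a TSBDF setup driven by an IFunction (using
a made-up scalar test ODE u' = -u written in implicit form F(t,u,udot) =
udot + u = 0, and assuming a recent PETSc with PetscCall()):

#include <petscts.h>

static PetscErrorCode IFunction(TS ts, PetscReal t, Vec U, Vec Udot, Vec F, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(VecWAXPY(F, 1.0, Udot, U)); /* F(t,u,udot) = udot + u */
  PetscFunctionReturn(PETSC_SUCCESS);
}

static PetscErrorCode IJacobian(TS ts, PetscReal t, Vec U, Vec Udot, PetscReal shift, Mat A, Mat B, void *ctx)
{
  const PetscInt    row = 0;
  const PetscScalar v   = shift + 1.0; /* shift * dF/dudot + dF/du */
  PetscFunctionBeginUser;
  PetscCall(MatSetValues(B, 1, &row, 1, &row, &v, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY));
  PetscFunctionReturn(PETSC_SUCCESS);
}

int main(int argc, char **argv)
{
  TS  ts;
  Vec u;
  Mat J;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(VecCreateSeq(PETSC_COMM_SELF, 1, &u));
  PetscCall(VecSet(u, 1.0)); /* u(0) = 1 */
  PetscCall(MatCreateSeqAIJ(PETSC_COMM_SELF, 1, 1, 1, NULL, &J));

  PetscCall(TSCreate(PETSC_COMM_SELF, &ts));
  PetscCall(TSSetType(ts, TSBDF));
  PetscCall(TSSetIFunction(ts, NULL, IFunction, NULL));
  PetscCall(TSSetIJacobian(ts, J, J, IJacobian, NULL));
  PetscCall(TSSetTimeStep(ts, 1e-3));
  PetscCall(TSSetMaxTime(ts, 1.0));
  PetscCall(TSSetExactFinalTime(ts, TS_EXACTFINALTIME_MATCHSTEP));
  PetscCall(TSSetFromOptions(ts)); /* so -ts_monitor, -ts_view, -ts_adapt_*, ... apply */

  PetscCall(TSSolve(ts, u));
  PetscCall(VecView(u, PETSC_VIEWER_STDOUT_SELF));

  PetscCall(TSDestroy(&ts));
  PetscCall(MatDestroy(&J));
  PetscCall(VecDestroy(&u));
  PetscCall(PetscFinalize());
  return 0;
}

Run with -ts_monitor -ts_view to get the same kind of adaptive-step output as
in the ex19 run shown earlier.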

Re: [petsc-users] Why PetscDestroy global collective semantics?

2021-10-24 Thread Patrick Sanan
I think Jeremy (cc‘d) has also been thinking about this in the context of
PETSc.jl

Stefano Zampini  schrieb am So. 24. Okt. 2021 um
07:52:

> Non-deterministic garbage collection is an issue from Python too, and
> firedrake folks are also working on that.
>
> We may consider deferring all calls to MPI_Comm_free done on communicators
> with 1 as ref count (i.e., the call will actually wipe out some internal
> MPI data) in a collective call that can be either run by the user (on
> PETSC_COMM_WORLD), or at PetscFinalize() stage.
> I.e., something like that
>
> #define MPI_Comm_free(comm) PutCommInAList(comm)
>
> Comm creation is collective by definition, and thus collectiveness of the
> order of the destruction can be easily enforced.
> I don't see problems with 3rd party libraries using comms, since we always
> duplicate the comm we passed them
>
> Lawrence, do you think this may help you?
>
> Thanks
> Stefano
>
> Il giorno dom 24 ott 2021 alle ore 05:58 Barry Smith 
> ha scritto:
>
>>
>>   Ahh, this makes perfect sense.
>>
>>   The code for PetscObjectRegisterDestroy() and the actual destruction
>> (called in PetscFinalize()) is very simple and can be found in
>> src/sys/objects/destroy.c PetscObjectRegisterDestroy(), 
>> PetscObjectRegisterDestroyAll().
>>
>>   You could easily maintain a new array
>> like PetscObjectRegisterGCDestroy_Objects[] and add objects
>> with PetscObjectRegisterGCDestroy() and then destroy them
>> with PetscObjectRegisterDestroyGCAll(). The only tricky part is that you
>> have to have, in the context of your Julia MPI, make sure
>> that PetscObjectRegisterDestroyGCAll() is called collectively over all the
>> MPI ranks (that is it has to be called where all the ranks have made the
>> same progress on MPI communication) that have registered objects to
>> destroy, generally PETSC_COMM_WORLD.  We would be happy to incorporate such a
>> system into the PETSc source with a merge request.
>>
>>   Barry
>>
>> On Oct 23, 2021, at 10:40 PM, Alberto F. Martín 
>> wrote:
>>
>> Thanks all for your very insightful answers.
>>
>> We are leveraging PETSc from Julia in a parallel distributed memory
>> context (several MPI tasks running the Julia REPL each).
>>
>> Julia uses Garbage Collection (GC), and we would like to destroy the
>> PETSc objects automatically when the GC decides so along the simulation.
>>
>> In this context, we cannot guarantee deterministic destruction on all MPI
>> tasks as the GC decisions are local to each task, no global semantics
>> guaranteed.
>>
>> As far as I understand from your answers, there seems to be the
>> possibility to defer the destruction of objects till points in the parallel
>> program in which you can guarantee collective semantics, correct? If yes I
>> guess that this may occur at any point in the simulation, not necessarily
>> at shut down via PetscFinalize(), right?
>>
>> Best regards,
>>
>>  Alberto.
>>
>>
>> On 24/10/21 1:10 am, Jacob Faibussowitsch wrote:
>>
>> Depending on the use-case you may also find PetscObjectRegisterDestroy()
>> useful. If you can’t guarantee your PetscObjectDestroy() calls are
>> collective, but have some other collective section you may call it then to
>> punt the destruction of your object to PetscFinalize() which is guaranteed
>> to be collective.
>>
>>
>> https://petsc.org/main/docs/manualpages/Sys/PetscObjectRegisterDestroy.html
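
As a minimal sketch of that pattern (illustrative only, assuming a recent
PETSc with PetscCall(); the Vec and its size are placeholders):

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec v;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, 10, &v));
  /* Suppose a garbage collector (not synchronized across ranks) decides v is
     dead: instead of VecDestroy(&v), which may free an MPI communicator and
     so is potentially collective, register the object for destruction inside
     PetscFinalize(), which is collective anyway. */
  PetscCall(PetscObjectRegisterDestroy((PetscObject)v));
  v = NULL;
  PetscCall(PetscFinalize()); /* registered objects are destroyed in here */
  return 0;
}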
>>
>> Best regards,
>>
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>>
>> On Oct 22, 2021, at 23:33, Jed Brown  wrote:
>>
>> Junchao Zhang  writes:
>>
>> On Fri, Oct 22, 2021 at 9:13 PM Barry Smith  wrote:
>>
>>
>>  One technical reason is that PetscHeaderDestroy_Private() may call
>> PetscCommDestroy() which may call MPI_Comm_free() which is defined by the
>> standard to be collective. Though PETSc tries to limit its use of new MPI
>> communicators (for example generally many objects shared the same
>> communicator) if we did not free those we no longer need when destroying
>> objects we could run out.
>>
>> PetscCommDestroy() might call MPI_Comm_free() , but it is very unlikely.
>> Petsc uses reference counting on communicators, so in PetscCommDestroy(),
>> it likely just decreases the count. In other words, PetscCommDestroy() is
>> cheap and in effect not collective.
>>
>>
>> Unless it's the last reference to a given communicator, which is a
>> risky/difficult thing for a user to guarantee and the consequences are
>> potentially dire (deadlock being way worse than a crash) when the user's
>> intent is to relax ordering for destruction.
>>
>> Alberto, what is the use case in which deterministic destruction is
>> problematic? If you relax it for individual objects, is there a place you
>> can be collective to collect any stale communicators?
>>
>>
>> --
>> Alberto F. Martín-Huertas
>> Senior Researcher, PhD. Computational Science
>> Centre Internacional de Mètodes Numèrics a l'Enginyeria (CIMNE)
>> Parc Mediterrani de la Tecnologia, UPC, Esteve Terradas 5, Building C3, Office 
>> 

Re: [petsc-users] DMView and DMLoad

2021-09-20 Thread Patrick Sanan
Thanks for reporting that! There are still some things in the "classic"
docs build that we could make more robust but it's not clear if we should
devote the energy to that or to continuing to replace those processes with
new ones which are  better integrated with the Sphinx build.

MR that should hopefully at least partially fix this particular issue:
https://gitlab.com/petsc/petsc/-/merge_requests/4331
General issue: https://gitlab.com/petsc/petsc/-/issues/279

Am Mo., 20. Sept. 2021 um 15:38 Uhr schrieb Lawrence Mitchell :

> Dear Sergio,
>
> (Added petsc-users back to cc),
>
> > On 20 Sep 2021, at 14:08, sergio.bengoec...@ovgu.de wrote:
> >
> > Dear Lawrence,
> >
> > thanks for the HDF5 saving and loading example.
> >
> > In the documentation you sent (
> https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5)
> the link to a more comprehensive example (DMPlex Tutorial ex12) is not
> working.
>
> Hmm, I'm not in charge of how the documentation gets built.
>
> @Patrick, if I look here:
> https://petsc.org/main/src/dm/impls/plex/tutorials/, I don't see ex12.c
> (even though it is here
> https://gitlab.com/petsc/petsc/-/tree/main/src/dm/impls/plex/tutorials)
>
> The relevant link in the docs goes to
> https://petsc.org/main/docs/src/dm/impls/plex/tutorials/ex12.c.html
> (which is 404), I suppose it should be main/src/... (not main/docs/src...)
>
> > I am afraid I would need to see the whole example to make our case work.
> If I could have access the completed source code of that example would be
> of a great help.
>
> You can find the relevant example in the PETSc source tree
> https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tutorials/ex12.c
>
> Lawrence


[petsc-users] Postdoctoral position at ETH Zurich: Geodynamics / HPC / Julia

2021-08-31 Thread Patrick Sanan
The Geophysical Fluid Dynamics group at ETH Zurich (Switzerland) is seeking
a postdoctoral appointee to work for about 2.5 years on an ambitious
project involving developing a Julia-based library for GPU-accelerated
multiphysics solvers based on pseudotransient relaxation.

Of particular interest for this audience might be that a major component of
the proposed work is to make these solvers available via PETSc (as a SNES
implementation), thus exposing them for use within a host of existing HPC
applications, including those involved in this specific project.

We'll accept applications until the position is filled, but for full
consideration please apply before October 1, 2021.

Full information is in the ad at the following link, and please feel free
to contact me directly!
https://github.com/psanan/gpu4geo_postdoc_ad/

Best,
Patrick


Re: [petsc-users] Question about MatGetSubMatrix

2021-07-26 Thread Patrick Sanan


> Am 24.07.2021 um 07:32 schrieb Barry Smith :
> 
> 
>   Qi,
> 
> MatSetLocalToGlobalMapping() is the routine that provides the 
> information to the matrix so that one can use a "local" ordering and it will 
> get automatically translated to the parallel PETSc ordering on the parallel 
> matrix. Generically yes it includes the ghost point region on each MPI rank. 
> Some DM's provide this information automatically; for DMDA it is pretty 
> simple, for DMStag a bit more complicated. 
> 
The local-to-global mapping should get automatically attached to the Mat if you 
use a DM to create an operator/Mat, with DMCreateMatrix(). Thus, this is the 
recommended way to proceed!

https://petsc.org/release/docs/manualpages/DM/DMCreateMatrix.html
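
Putting those pieces together, a rough sketch might look like the following
(grid sizes, dof layout, and the chosen boundary location are all illustrative,
and it assumes a recent PETSc with PetscCall()):

#include <petscdmstag.h>

int main(int argc, char **argv)
{
  DM            dm;
  Mat           A;
  DMStagStencil pos;
  PetscInt      ix = -1, startx, starty, startz, nx, ny, nz, nex, ney, nez, nZero = 0;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(DMStagCreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, 8, 8,
                           PETSC_DECIDE, PETSC_DECIDE, 0, 1, 1, /* dof per vertex, edge, element */
                           DMSTAG_STENCIL_BOX, 1, NULL, NULL, &dm));
  PetscCall(DMSetUp(dm));
  PetscCall(DMCreateMatrix(dm, &A)); /* the local-to-global mapping is attached here */

  /* ... fill A with DMStagMatSetValuesStencil(), then assemble ... */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  /* eliminate one boundary edge dof (the left edge of element (0,0)),
     only on the rank that owns it - cf. the is_boundary check above */
  PetscCall(DMStagGetCorners(dm, &startx, &starty, &startz, &nx, &ny, &nz, &nex, &ney, &nez));
  if (startx == 0 && starty == 0) {
    pos.i = 0; pos.j = 0; pos.loc = DMSTAG_LEFT; pos.c = 0;
    PetscCall(DMStagStencilToIndexLocal(dm, 2, 1, &pos, &ix));
    nZero = 1;
  }
  PetscCall(MatZeroRowsLocal(A, nZero, &ix, 1.0, NULL, NULL)); /* collective; 1.0 goes on the diagonal */

  PetscCall(MatDestroy(&A));
  PetscCall(DMDestroy(&dm));
  PetscCall(PetscFinalize());
  return 0;
}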

>   You are correct, the previous manual page for MatZeroRowsLocal() 
> incorrectly states "global" rows from 16+ years ago, likely a cut and paste 
> error. It looks like it has been fixed in the main PETSc repository,
> 
>   Barry
> 
> 
>> On Jul 23, 2021, at 10:50 AM, Tang, Qi > <mailto:tan...@msu.edu>> wrote:
>> 
>> How can we use MatZeroRowsLocal? Is there any doc for the local index vs 
>> global index for a matrix?
>> 
>> I am asking because we are not sure about how to prepare a proper local 
>> index. Does the index include the ghost point region or not?
>> 
>> Qi
>> 
>> 
>> 
>> 
>>> On Jul 20, 2021, at 9:30 PM, Tang, Qi >> <mailto:tan...@msu.edu>> wrote:
>>> 
>>> Hi,
>>> 
>>> Now I think the DMStagStencilToIndexLocal provides the local index for 
>>> given (stencil) positions. How can we use that local index information to 
>>> eliminate the rows? 
>>> 
>>> Is the following code possible:
>>> 
>>> MatSetLocalToGlobalMapping(…);
>>> If (is_boundary){
>>> PetscInt ix;
>>> DMStagStencilToIndexLocal(…, &ix);
>>> MatZeroRowsLocal(… , …);
>>> }
>>> 
>>> The comment of MatZeroRowsLocal said "rows - the global row indices”. But 
>>> this seems inconsistent with its name, so I am confused.
>>> 
>>> Thanks,
>>> Qi
>>> 
>>> 
>>>> On Jul 20, 2021, at 2:18 AM, Patrick Sanan >>> <mailto:patrick.sa...@gmail.com>> wrote:
>>>> 
>>>> Hi Qi -
>>>> 
>>>> I just opened a PR to make DMStagStencilToIndexLocal() public
>>>> https://gitlab.com/petsc/petsc/-/merge_requests/4180 
>>>> 
>>>> (Sorry for my inattention - I think I may have missed some communications 
>>>> in processing the flood of PETSc emails too quickly - I still plan to get 
>>>> some more automatic DMStag fieldsplit capabilities into main, if it's not 
>>>> too late).
>>>> 
>>>>> Am 20.07.2021 um 02:47 schrieb Tang, Qi >>>> <mailto:tan...@msu.edu>>:
>>>>> 
>>>>> Hi,
>>>>> As a part of implementing this process by ourself, we would like to 
>>>>> eliminate boundary dofs. By reading DMStag code, we guess we can use
>>>>> DMStagStencilToIndexLocal
>>>>> MatZeroRowsLocal
>>>>> 
>>>>> We note that DMStagStencilToIndexLocal is not explicitly defined in the 
>>>>> header file. Is this function ready to use? And will we be able to 
>>>>> eliminate the dofs using the above functions?
>>>>> 
>>>>> Thanks,
>>>>> Qi
>>>>> 
>>>>> 
>>>>> 
>>>>>> On Jul 16, 2021, at 8:17 PM, Barry Smith >>>>> <mailto:bsm...@petsc.dev>> wrote:
>>>>>> 
>>>>>> 
>>>>>>   Zakariae,
>>>>>> 
>>>>>> MatGetSubMatrix() was removed a long time ago, the routine is now 
>>>>>> MatCreateSubMatrix() but it does not work in the way you had hoped. There is 
>>>>>> currently no mechanism to move values you put into the sub matrix back 
>>>>>> into the original larger matrix (though perhaps there should be?). 
>>>>>> 
>>>>>> Please look at MatCreateSubMatrixVirtual() and also MatCreateNest() 
>>>>>> to see if either of those approaches satisfy your needs. 
>>>>>> 
>>>>>> Please let us know if there are extensions that would be useful for 
>>>>>> you to accomplish what you need. 
>>>>>> 
>>>>>>   Barry
>>>>>> 
>>>>>> 
>>>>>>> On Jul 16, 2021, at 7:45 PM, Jorti, Zakariae via petsc-users 
>>>>>>> mailto:petsc-users@mcs.anl.gov>> wrote:
>>>>>>> 
>>>>>>> Hello,
>>>>>>> 
>>>>>>> I have a matrix A = [A00 , A01 ; A10, A11]. 
>>>>>>> I extract the submatrix A11 with MatGetSubMatrix.
>>>>>>> I only know the global IS is1 and is2, so to get A11 I call: 
>>>>>>> MatGetSubMatrix(A,is2,is2,MAT_INITIAL_MATRIX,&A11);
>>>>>>> I want to modify A11 and update the changes on the global matrix A but 
>>>>>>> I could not find any MatRestoreSubMatrix routine.
>>>>>>> Is there something similar to VecGetSubVector and VecRestoreSubVector 
>>>>>>> for matrices that uses only global indices?
>>>>>>> Many thanks.
>>>>>>> Best regards,
>>>>>>> 
>>>>>>> Zakariae 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 



Re: [petsc-users] Question about MatGetSubMatrix

2021-07-21 Thread Patrick Sanan


> Am 21.07.2021 um 05:30 schrieb Tang, Qi :
> 
> Hi,
> 
> Now I think the DMStagStencilToIndexLocal provides the local index for given 
> (stencil) positions. How can we use that local index information to eliminate 
> the rows? 
> 
> Is the following code possible:
> 
> MatSetLocalToGlobalMapping(…);
> If (is_boundary){
> PetscInt ix;
> DMStagStencilToIndexLocal(…, &ix);
> MatZeroRowsLocal(… , …);
> }
> 
> The comment of MatZeroRowsLocal said "rows - the global row indices”. But 
> this seems inconsistent with its name, so I am confused.
That was indeed a typo on the man page, fix here:
https://gitlab.com/petsc/petsc/-/merge_requests/4183 

> 
> Thanks,
> Qi
> 
> 
>> On Jul 20, 2021, at 2:18 AM, Patrick Sanan > <mailto:patrick.sa...@gmail.com>> wrote:
>> 
>> Hi Qi -
>> 
>> I just opened a PR to make DMStagStencilToIndexLocal() public
>> https://gitlab.com/petsc/petsc/-/merge_requests/4180 
>> 
>> (Sorry for my inattention - I think I may have missed some communications in 
>> processing the flood of PETSc emails too quickly - I still plan to get some 
>> more automatic DMStag fieldsplit capabilities into main, if it's not too 
>> late).
>> 
>>> Am 20.07.2021 um 02:47 schrieb Tang, Qi >> <mailto:tan...@msu.edu>>:
>>> 
>>> Hi,
>>> As a part of implementing this process by ourself, we would like to 
>>> eliminate boundary dofs. By reading DMStag code, we guess we can use
>>> DMStagStencilToIndexLocal
>>> MatZeroRowsLocal
>>> 
>>> We note that DMStagStencilToIndexLocal is not explicitly defined in the 
>>> header file. Is this function ready to use? And will we be able to 
>>> eliminate the dofs using the above functions?
>>> 
>>> Thanks,
>>> Qi
>>> 
>>> 
>>> 
>>>> On Jul 16, 2021, at 8:17 PM, Barry Smith >>> <mailto:bsm...@petsc.dev>> wrote:
>>>> 
>>>> 
>>>>   Zakariae,
>>>> 
>>>> MatGetSubMatrix() was removed a long time ago, the routine is now 
>>>> MatCreateSubMatrix() but it does not work in the way you had hoped. There is 
>>>> currently no mechanism to move values you put into the sub matrix back 
>>>> into the original larger matrix (though perhaps there should be?). 
>>>> 
>>>> Please look at MatCreateSubMatrixVirtual() and also MatCreateNest() to 
>>>> see if either of those approaches satisfy your needs. 
>>>> 
>>>> Please let us know if there are extensions that would be useful for 
>>>> you to accomplish what you need. 
>>>> 
>>>>   Barry
>>>> 
>>>> 
>>>>> On Jul 16, 2021, at 7:45 PM, Jorti, Zakariae via petsc-users 
>>>>> mailto:petsc-users@mcs.anl.gov>> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> I have a matrix A = [A00 , A01 ; A10, A11]. 
>>>>> I extract the submatrix A11 with MatGetSubMatrix.
>>>>> I only know the global IS is1 and is2, so to get A11 I call: 
>>>>> MatGetSubMatrix(A,is2,is2,MAT_INITIAL_MATRIX,&A11);
>>>>> I want to modify A11 and update the changes on the global matrix A but I 
>>>>> could not find any MatRestoreSubMatrix routine.
>>>>> Is there something similar to VecGetSubVector and VecRestoreSubVector for 
>>>>> matrices that uses only global indices?
>>>>> Many thanks.
>>>>> Best regards,
>>>>> 
>>>>> Zakariae 
>>>> 
>>> 
>> 
> 



Re: [petsc-users] Question about MatGetSubMatrix

2021-07-20 Thread Patrick Sanan
Hi Qi -

I just opened a PR to make DMStagStencilToIndexLocal() public
https://gitlab.com/petsc/petsc/-/merge_requests/4180

(Sorry for my inattention - I think I may have missed some communications in 
processing the flood of PETSc emails too quickly - I still plan to get some 
more automatic DMStag fieldsplit capabilities into main, if it's not too late).

> Am 20.07.2021 um 02:47 schrieb Tang, Qi :
> 
> Hi,
> As a part of implementing this process by ourself, we would like to eliminate 
> boundary dofs. By reading DMStag code, we guess we can use
> DMStagStencilToIndexLocal
> MatZeroRowsLocal
> 
> We note that DMStagStencilToIndexLocal is not explicitly defined in the 
> header file. Is this function ready to use? And will we be able to eliminate 
> the dofs using the above functions?
> 
> Thanks,
> Qi
> 
> 
> 
>> On Jul 16, 2021, at 8:17 PM, Barry Smith > > wrote:
>> 
>> 
>>   Zakariae,
>> 
>> MatGetSubMatrix() was removed a long time ago, the routine is now 
>> MatCreateSubMatrix() but it does not work in the way you had hoped. There is 
>> currently no mechanism to move values you put into the sub matrix back into 
>> the original larger matrix (though perhaps there should be?). 
>> 
>> Please look at MatCreateSubMatrixVirtual() and also MatCreateNest() to 
>> see if either of those approaches satisfy your needs. 
>> 
>> Please let us know if there are extensions that would be useful for you 
>> to accomplish what you need. 
>> 
>>   Barry
>> 
>> 
>>> On Jul 16, 2021, at 7:45 PM, Jorti, Zakariae via petsc-users 
>>> mailto:petsc-users@mcs.anl.gov>> wrote:
>>> 
>>> Hello,
>>> 
>>> I have a matrix A = [A00 , A01 ; A10, A11]. 
>>> I extract the submatrix A11 with MatGetSubMatrix.
>>> I only know the global IS is1 and is2, so to get A11 I call: 
>>> MatGetSubMatrix(A,is2,is2,MAT_INITIAL_MATRIX,&A11);
>>> I want to modify A11 and update the changes on the global matrix A but I 
>>> could not find any MatRestoreSubMatrix routine.
>>> Is there something similar to VecGetSubVector and VecRestoreSubVector for 
>>> matrices that uses only global indices?
>>> Many thanks.
>>> Best regards,
>>> 
>>> Zakariae 
>> 
> 



Re: [petsc-users] PETSc with Julia Binary Builder

2021-07-02 Thread Patrick Sanan
As you mention in [4], the proximate cause of the configure failure is this 
link error [8]:

Naively, that looks like a problem to be resolved at the level of the C++ 
compiler and MPI.

Unless there are wrinkles of this build process that I don't understand 
(likely), this [6] looks non-standard to me:

includedir="${prefix}/include"
...
./configure --prefix=${prefix} \
...
-with-mpi-include="${includedir}" \
...


Is it possible to configure using  --with-mpi-dir, instead of the separate 
--with-mpi-include and --with-mpi-lib commands? 
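
Assuming MPI really is installed under the same ${prefix} (which the
includedir setting above suggests, though I may be wrong about the layout),
a sketch of that would simply be:

./configure --prefix=${prefix} \
    ...
    --with-mpi-dir="${prefix}" \
    ...

so that configure locates the MPI compilers, headers, and libraries itself.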


As an aside, maybe Satish can say more, but I'm not sure if it's advisable to 
override variables in the make command [7].

[8] 
https://gist.github.com/jkozdon/c161fb15f2df23c3fbc0a5a095887ef8#file-configure-log-L7795
[6] 
https://gist.github.com/jkozdon/c161fb15f2df23c3fbc0a5a095887ef8#file-build_tarballs-jl-L45
[7] 
https://gist.github.com/jkozdon/c161fb15f2df23c3fbc0a5a095887ef8#file-build_tarballs-jl-L55


> Am 02.07.2021 um 06:25 schrieb Kozdon, Jeremy (CIV) :
> 
> I have been talking with Boris Kaus and Patrick Sanan about trying to revive 
> the Julia PETSc interface wrappers. One of the first things to get going is 
> to use Julia's binary builder [1] to wrap more scalar, real, and int type 
> builds of the PETSc library; the current distribution is just Real, double, 
> Int32. I've been working on a PR for this [2] but have been running into some 
> build issues on some architectures [3].
> 
> I doubt that anyone here is an expert with Julia's binary builder system, but 
> I was wondering if anyone who is better with the PETSc build system can see 
> anything obvious from the configure.log [4] that might help me sort out 
> what's going on.
> 
> This exact script worked on 2020-08-20 [5] to build the libraries, se 
> something has obviously changed with either the Julia build system and/or one 
> (or more!) of the dependency binaries.
> 
> For those that don't know, Julia's binary builder system essentially allows 
> users to download binaries directly from the web for any system that the 
> Julia Programing language distributes binaries for. So a (desktop) user can 
> get MPI, PETSc, etc. without the headache of having to build anything from 
> scratch; obviously on clusters you would still want to use system MPIs and 
> what not.
> 
> 
> 
> [1] https://github.com/JuliaPackaging/BinaryBuilder.jl
> [2] https://github.com/JuliaPackaging/Yggdrasil/pull/3249
> [3] 
> https://github.com/JuliaPackaging/Yggdrasil/pull/3249#issuecomment-872698681
> [4] 
> https://gist.github.com/jkozdon/c161fb15f2df23c3fbc0a5a095887ef8#file-configure-log
> [5] 
> https://github.com/JuliaBinaryWrappers/PETSc_jll.jl/releases/tag/PETSc-v3.13.4%2B0



Re: [petsc-users] Question about PCFieldSplit

2021-06-22 Thread Patrick Sanan
Hi Zakariae -

The usual way to do this is to define an IS (index set) with the degrees of 
freedom of interest for the rows, and another one for the columns, and then use 
MatCreateSubMatrix [1].

There's not a particularly convenient way to create an IS with the degrees of 
freedom corresponding to a particular "stratum" (i.e. elements, faces, edges, 
or vertices) of a DMStag, but fortunately I believe we have some code to do 
exactly this in a development branch. 

I'll track it down and see if it can quickly be added to the main branch.


[1]: https://petsc.org/release/docs/manualpages/Mat/MatCreateSubMatrix.html
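
As a rough sketch of that approach (assuming a recent PETSc; the index arrays
stand in for whatever locally owned row/column dof lists you build, e.g. the V
dofs and the E dofs):

#include <petscmat.h>

/* Extract the block of A with the given (locally owned) row and column dofs. */
PetscErrorCode ExtractBlock(Mat A, PetscInt nrow, const PetscInt rowdofs[],
                            PetscInt ncol, const PetscInt coldofs[], Mat *Asub)
{
  IS isrow, iscol;

  PetscFunctionBeginUser;
  PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)A), nrow, rowdofs, PETSC_COPY_VALUES, &isrow));
  PetscCall(ISCreateGeneral(PetscObjectComm((PetscObject)A), ncol, coldofs, PETSC_COPY_VALUES, &iscol));
  PetscCall(MatCreateSubMatrix(A, isrow, iscol, MAT_INITIAL_MATRIX, Asub));
  PetscCall(ISDestroy(&isrow));
  PetscCall(ISDestroy(&iscol));
  PetscFunctionReturn(PETSC_SUCCESS);
}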

> Am 22.06.2021 um 22:29 schrieb Jorti, Zakariae :
> 
> Hello,
> 
> I am working on DMStag and I have one dof on vertices (let us call it V), one 
> dof on edges (let us call it E), one dof on faces ((let us call it F)) and 
> one dof on cells (let us call it C).
> I build a matrix on this DM, and I was wondering if there was a way to get 
> blocks (or sub matrices) of this matrix corresponding to specific degrees of 
> freedom, for example rows corresponding to V dofs and columns corresponding 
> to E dofs.
> I already asked this question before and the answer I got was I could call 
> PCFieldSplitSetDetectSaddlePoint with the diagonal entries of the 
> matrix being zero or nonzero. 
> That worked well. Nonetheless, I am curious to know if there was another 
> alternative that does not require creating a dummy matrix with appropriate 
> diagonal entries and solving a dummy linear system with this matrix to define 
> the splits. 
> 
> Many thanks.
> 
> Best regards,
> 
> Zakariae
> From: petsc-users  on behalf of Tang, Qi 
> 
> Sent: Sunday, April 18, 2021 11:51:59 PM
> To: Patrick Sanan
> Cc: petsc-users@mcs.anl.gov; Tang, Xianzhu
> Subject: [EXTERNAL] Re: [petsc-users] Question about PCFieldSplit
>  
> Thanks a lot, Patrick. We appreciate your help.
> 
> Qi
> 
> 
> 
>> On Apr 18, 2021, at 11:30 PM, Patrick Sanan > <mailto:patrick.sa...@gmail.com>> wrote:
>> 
>> We have this functionality in a branch, which I'm working on cleaning up to 
>> get to master. It doesn't use PETScSection. Sorry about the delay!
>> 
>> You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal entries 
>> being zero or non-zero defines the splits correctly. 
>> 
>>> Am 17.04.2021 um 21:09 schrieb Matthew Knepley >> <mailto:knep...@gmail.com>>:
>>> 
>>> On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users 
>>> mailto:petsc-users@mcs.anl.gov>> wrote:
>>> Hello,
>>> 
>>> I have a DMStag grid with one dof on each edge and face center. 
>>> I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I 
>>> assume is already split but I am not sure how to determine the fields. 
>>> In the DMStag examples (ex2.c and ex3.c), the function 
>>> PCFieldSplitSetDetectSaddlePoint is used to determine those fields based on 
>>> zero diagonal entries. In my case, I have a Jacobian matrix that does not 
>>> have zero diagonal entries. 
>>> Can I use that PCFieldSplitSetDetectSaddlePoint in this case? 
>>> If not, how should I do? 
>>> Should I do like this example 
>>> (https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html):
>>>  
>>> const PetscInt Bfields[1] = {0},Efields[1] = {1};
>>> KSPGetPC(ksp,&pc);
>>> PCFieldSplitSetBlockSize(pc,2);
>>> PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); 
>>> PCFieldSplitSetFields(pc,"E",1,Efields,Efields); 
>>> where my B unknowns are defined on face centers and E unknowns are defined 
>>> on edge centers?
>>> That will not work. That interface only works for colocated fields that you 
>>> get from DMDA.
>>> 
>>> Patrick, does DMSTAG use PetscSection? Then the field split would be 
>>> automatically calculated. If not, does it maintain the
>>> field division so that it could be given to PCFIELDSPLIT as ISes?
>>> 
>>>   Thanks,
>>> 
>>>  Matt
>>> One last thing, I do not know which field comes first. Is it the one 
>>> defined for face dofs or edge dofs.
>>> 
>>> Thank you.
>>> Best regards,
>>> 
>>> Zakariae
>>> 
>>> 
>>> 
>>> -- 
>>> What most experimenters take for granted before they begin their 
>>> experiments is infinitely more interesting than any results to which their 
>>> experiments lead.
>>> -- Norbert Wiener
>>> 
>>> https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] Question about PCFieldSplit

2021-04-20 Thread Patrick Sanan
That's great - if you run into issues that I can help with in the meantime, let 
me know. Even if you can't share the code, maybe you could share a screen or 
something like that. This would undoubtedly be helpful for me in seeing how 
people use the code "in the wild" : )

> Am 20.04.2021 um 08:29 schrieb Tang, Qi :
> 
> Patrick,
> We were able to get field split working in DMStag with the dummy solver 
> approach I described. Let us know when you add the capability and we will be 
> happy to test it. Meanwhile, please take your time to merge it.
> 
> Thanks,
> Qi
> 
> 
> 
>> On Apr 18, 2021, at 11:51 PM, Tang, Qi > <mailto:tan...@msu.edu>> wrote:
>> 
>> Thanks a lot, Patrick. We appreciate your help.
>> 
>> Qi
>> 
>> 
>> 
>>> On Apr 18, 2021, at 11:30 PM, Patrick Sanan >> <mailto:patrick.sa...@gmail.com>> wrote:
>>> 
>>> We have this functionality in a branch, which I'm working on cleaning up to 
>>> get to master. It doesn't use PETScSection. Sorry about the delay!
>>> 
>>> You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal 
>>> entries being zero or non-zero defines the splits correctly. 
>>> 
>>>> Am 17.04.2021 um 21:09 schrieb Matthew Knepley >>> <mailto:knep...@gmail.com>>:
>>>> 
>>>> On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users 
>>>> mailto:petsc-users@mcs.anl.gov>> wrote:
>>>> Hello,
>>>> 
>>>> 
>>>> 
>>>> I have a DMStag grid with one dof on each edge and face center. 
>>>> 
>>>> I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I 
>>>> assume is already split but I am not sure how to determine the fields. 
>>>> 
>>>> In the DMStag examples (ex2.c and ex3.c), the function 
>>>> PCFieldSplitSetDetectSaddlePoint is used to determine those fields based 
>>>> on zero diagonal entries. In my case, I have a Jacobian matrix that does 
>>>> not have zero diagonal entries. 
>>>> 
>>>> Can I use that PCFieldSplitSetDetectSaddlePoint in this case? 
>>>> 
>>>> If not, how should I do? 
>>>> 
>>>> Should I do like this example 
>>>> (https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html):
>>>>  
>>>> 
>>>> const PetscInt Bfields[1] = {0},Efields[1] = {1};
>>>> 
>>>> KSPGetPC(ksp,&pc);
>>>> 
>>>> PCFieldSplitSetBlockSize(pc,2);
>>>> 
>>>> PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); 
>>>> PCFieldSplitSetFields(pc,"E",1,Efields,Efields); 
>>>> 
>>>> where my B unknowns are defined on face centers and E unknowns are defined 
>>>> on edge centers?
>>>> 
>>>> That will not work. That interface only works for colocated fields that you 
>>>> get from DMDA.
>>>> 
>>>> Patrick, does DMSTAG use PetscSection? Then the field split would be 
>>>> automatically calculated. If not, does it maintain the
>>>> field division so that it could be given to PCFIELDSPLIT as ISes?
>>>> 
>>>>   Thanks,
>>>> 
>>>>  Matt
>>>> One last thing, I do not know which field comes first. Is it the one 
>>>> defined for face dofs or edge dofs.
>>>> 
>>>> 
>>>> 
>>>> Thank you.
>>>> 
>>>> Best regards,
>>>> 
>>>> 
>>>> 
>>>> Zakariae
>>>> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> What most experimenters take for granted before they begin their 
>>>> experiments is infinitely more interesting than any results to which their 
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>> 
>>>> https://www.cse.buffalo.edu/~knepley/ 
>> 
> 



Re: [petsc-users] Question about PCFieldSplit

2021-04-18 Thread Patrick Sanan
We have this functionality in a branch, which I'm working on cleaning up to get 
to master. It doesn't use PETScSection. Sorry about the delay!

You can only use PCFieldSplitSetDetectSaddlePoint when your diagonal entries 
being zero or non-zero defines the splits correctly. 
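
In the meantime, if you can build index sets with the global dofs of each
field yourself, they can be handed to PCFIELDSPLIT directly. A rough sketch,
with isB and isE standing in for ISs you have already created (assumes a
recent PETSc with PetscCall()):

/* Sketch only: isB and isE are assumed to already hold the global dof
   indices of the two fields (e.g. face dofs and edge dofs). */
KSP ksp;
PC  pc;
/* ... KSPCreate() / KSPSetOperators() ... */
PetscCall(KSPGetPC(ksp, &pc));
PetscCall(PCSetType(pc, PCFIELDSPLIT));
PetscCall(PCFieldSplitSetIS(pc, "B", isB));
PetscCall(PCFieldSplitSetIS(pc, "E", isE));
/* then e.g. -pc_fieldsplit_type schur and friends from the options database */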

> Am 17.04.2021 um 21:09 schrieb Matthew Knepley :
> 
> On Fri, Apr 16, 2021 at 8:39 PM Jorti, Zakariae via petsc-users 
> mailto:petsc-users@mcs.anl.gov>> wrote:
> Hello,
> 
> 
> 
> I have a DMStag grid with one dof on each edge and face center. 
> 
> I want to use a PCFieldSplit preconditioner on a Jacobian matrix that I 
> assume is already split but I am not sure how to determine the fields. 
> 
> In the DMStag examples (ex2.c and ex3.c), the function 
> PCFieldSplitSetDetectSaddlePoint is used to determine those fields based on 
> zero diagonal entries. In my case, I have a Jacobian matrix that does not 
> have zero diagonal entries. 
> 
> Can I use that PCFieldSplitSetDetectSaddlePoint in this case? 
> 
> If not, how should I do? 
> 
> Should I do like this example 
> (https://www.mcs.anl.gov/petsc/petsc-master/src/ksp/ksp/tutorials/ex43.c.html 
> ):
>  
> 
> const PetscInt Bfields[1] = {0},Efields[1] = {1};
> 
> KSPGetPC(ksp,&pc);
> 
> PCFieldSplitSetBlockSize(pc,2);
> 
> PCFieldSplitSetFields(pc,"B",1,Bfields,Bfields); 
> PCFieldSplitSetFields(pc,"E",1,Efields,Efields); 
> 
> where my B unknowns are defined on face centers and E unknowns are defined on 
> edge centers?
> 
> That will not work. That interface only works for colocated fields that you 
> get from DMDA.
> 
> Patrick, does DMSTAG use PetscSection? Then the field split would be 
> automatically calculated. If not, does it maintain the
> field division so that it could be given to PCFIELDSPLIT as ISes?
> 
>   Thanks,
> 
>  Matt
> One last thing, I do not know which field comes first. Is it the one defined 
> for face dofs or edge dofs.
> 
> 
> 
> Thank you.
> 
> Best regards,
> 
> 
> 
> Zakariae
> 
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] Question about periodic conditions

2021-03-23 Thread Patrick Sanan
Hi Zakariae - sorry about the delay - responses inline below.

I'd be curious to see your code (which you can send directly to me if you don't 
want to post it publicly), so I can give you more comments, as DMStag is a new 
component. 


> Am 23.03.2021 um 00:54 schrieb Jorti, Zakariae :
> 
> Hi, 
> 
> I implemented a PETSc code to solve Maxwell's equations for the magnetic and 
> electric fields (B and E) in a cylinder: 
> 0 < r_min <= r <= r_max; with r_max > r_min
> phi_min = 0 <= phi <= phi_max = 2π
> z_min <= z <= z_max; with z_max > z_min.
> 
> I am using a PETSc staggered grid with the electric field E defined on edge 
> centers and the magnetic field B defined on face centers. (dof0 = 0, dof1 = 
> 1,dof2 = 1, dof3 = 0;).  
> 
> I have two versions of my code: 
> 1 - A first version in which I set the boundary type to DM_BOUNDARY_NONE in 
> the three directions r, phi and z
> 2- A second version in which I set the boundary type to DM_BOUNDARY_NONE in 
> the r and z directions, and DM_BOUNDARY_PERIODIC in the phi direction. 
> 
> When I print the solution vector X, which contains both E and B components, I 
> notice that the vector is shorter with the second version compared to the 
> first one. 
> Is it normal?  
Yes - with the periodic boundary conditions, there will be fewer points since 
there won't be the "extra" layer of faces and edges at phi  = 2 * pi .

If you consider a 1-d example with 1 dof on vertices and cells, with three 
elements, the periodic case looks like this, globally,

x  x  x 

as opposed to the non-periodic case,

   x  x  x  x
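
If it is useful, here is a tiny sketch to check those counts (3 elements, 1
dof on vertices and 1 on elements; the sizes are illustrative and it assumes a
recent PETSc with PetscCall()). The periodic case should report a global
vector of size 6 and the non-periodic case size 7:

#include <petscdmstag.h>

int main(int argc, char **argv)
{
  DMBoundaryType btypes[2] = {DM_BOUNDARY_PERIODIC, DM_BOUNDARY_NONE};

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  for (PetscInt b = 0; b < 2; ++b) {
    DM       dm;
    Vec      x;
    PetscInt N;

    PetscCall(DMStagCreate1d(PETSC_COMM_WORLD, btypes[b], 3, 1, 1, DMSTAG_STENCIL_BOX, 1, NULL, &dm));
    PetscCall(DMSetUp(dm));
    PetscCall(DMCreateGlobalVector(dm, &x));
    PetscCall(VecGetSize(x, &N));
    PetscCall(PetscPrintf(PETSC_COMM_WORLD, "boundary %s : global vector size %" PetscInt_FMT "\n",
                          DMBoundaryTypes[btypes[b]], N));
    PetscCall(VecDestroy(&x));
    PetscCall(DMDestroy(&dm));
  }
  PetscCall(PetscFinalize());
  return 0;
}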


> 
> Besides, I was wondering if I have to change the way I define the value of 
> the solution on the boundary. What I am doing so far in both versions is 
> something like:
> B_phi [phi = 0] = 1.0;
> B_phi [phi = 2π] = 1.0;
> E_z [r, phi = 0] = 1/r;
> E_z [r, phi = 2π] = 1/r;
> 
> Assuming that values at phi = 0 should be the same as at phi=2π with the 
> periodic boundary conditions, is it sufficient for example to have only the 
> following boundary conditions:
> B_phi [phi = 0] = 1.0;
> E_z [r, phi = 0] = 1/r ? 

Yes - this is the intention, since the boundary at phi = 2 * pi is represented 
by the same entries in the global vector. 

Of course, you need to make sure that your continuous problem is well-posed, 
which in general could change when using different boundary conditions. 

> Thank you.
> Best regards,
> 
> Zakariae Jorti



Re: [petsc-users] OOM error while using TSSUNDIALS in PETSc

2021-03-17 Thread Patrick Sanan




> Am 17.03.2021 um 15:15 schrieb Sanjoy Kumar Mazumder :
> 
> Hi all,
> 
> I am trying to solve a set of coupled stiff ODEs in parallel using TSSUNDIALS 
> with SUNDIALS_BDF as 'TSSundialsSetType' in PETSc. I am using a sparse 
> Jacobian matrix of type MATMPIAIJ with no preconditioner. It runs for a long 
> time with a very small timestep (~10^-8 - 10^-10) and then terminates 
> abruptly with the following error:
> 
> 'slurmstepd: error: Detected 4 oom-kill event(s) in step 1701844.batch 
> cgroup. Some of your processes may have been killed by the cgroup 
> out-of-memory handler.'
> 
This general class of problem can arise if there is a (small) memory leak 
occurring at every time step, so that is the first thing to rule out. 

> After going through some of the common suggestions in the mailing list 
> before, 
> 
> 1) I tried increasing the memory alloted per cpu (--mem-per-cpu) in my batch 
> script but the problem still remains. 
When you tried increasing the memory allocated per CPU, did the solver take 
more timesteps before the OOM error?

> 2) I have also checked for proper deallocation of the arrays in my function 
> and jacobian sub-routines before every TS iteration.
Did you confirm this with a tool like valgrind? If not, is it possible for you 
to run a few time steps of your code on a local machine with valgrind?
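For example (an illustrative command only, to be adapted to your application
and its options), something like

valgrind --leak-check=full ./your_app -ts_max_steps 10 <other options>

run on a small problem size is usually enough, since a leak that grows with
the number of time steps will already show up after a handful of steps.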

> 3) The time allotted for my job in the assigned nodes (wall-time) far exceed 
> the time for which the job is actually running.
> 
> Is there anything I am missing out or not doing properly? Given below is the 
> complete error that is showing up after the termination.
> 
> Thanks
> 
> With regards,
> Sanjoy
> 
> --
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code. Per user-direction, the job has been aborted.
> --
> --
> MPI_ABORT was invoked on rank 11 in communicator MPI_COMM_WORLD
> with errorcode 50176059.
> 
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --
> [1]PETSC ERROR: 
> 
> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch 
> system) has told this process to end
> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [1]PETSC ERROR: or see 
> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind 
> 
> [1]PETSC ERROR: or try http://valgrind.org  on 
> GNU/linux and Apple Mac OS X to find memory corruption errors
> [1]PETSC ERROR: likely location of problem given in stack below
> [1]PETSC ERROR: -  Stack Frames 
> 
> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [1]PETSC ERROR:   INSTEAD the line number of the start of the function
> [1]PETSC ERROR:   is given.
> [1]PETSC ERROR: [1] TSStep_Sundials line 121 
> /home/mazumder/petsc-3.14.5/src/ts/impls/implicit/sundials/sundials.c
> [1]PETSC ERROR: [1] TSStep line 3736 
> /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c
> [1]PETSC ERROR: [1] TSSolve line 4046 
> /home/mazumder/petsc-3.14.5/src/ts/interface/ts.c
> [1]PETSC ERROR: - Error Message 
> --
> [1]PETSC ERROR: Signal received
> [1]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html 
>  for trouble shooting.
> [1]PETSC ERROR: Petsc Release Version 3.14.5, Mar 03, 2021
> [1]PETSC ERROR: ./ThO2 on a arch-linux-c-debug named 
> bell-a017.rcac.purdue.edu  by mazumder Mon 
> Mar 15 13:26:36 2021
> [1]PETSC ERROR: Configure options --with-cc-mpicc --with-cxx=mpicxx 
> --with-fc=mpif90 --download-fblaslapack --download-sundials=yes 
> --with-debugging
> [1]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> [2]PETSC ERROR: 
> 
> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch 
> system) has told this process to end
> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [2]PETSC ERROR: or see 
> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind 
> 
> [2]PETSC ERROR: or try http://valgrind.org  on 
> GNU/linux and Apple Mac OS X to find memory corruption errors
> [2]PETSC ERROR: likely location of problem 

Re: [petsc-users] [petsc-dev] headsup: switch git default branch from 'master' to 'main'

2021-02-26 Thread Patrick Sanan
The answers to these were probably stated already, but the reminder might be 
useful to others, as well.

What will happen to "master" after today? Will it be deleted immediately or at 
some planned time? If not immediately deleted, will it be updated to match main?

> Am 23.02.2021 um 18:19 schrieb Satish Balay via petsc-dev 
> :
> 
> All,
> 
> This is a heads-up, we are to switch the default branch in petsc git
> repo from 'master' to 'main'
> 
> [Will plan to do the switch on friday the 26th]
> 
> We've previously switched 'maint' branch to 'release' before 3.14
> release - and this change (to 'main') is the next step in this direction.
> 
> Satish
> 



Re: [petsc-users] Slower performance in multi-node system

2021-02-04 Thread Patrick Sanan
That page has been ported to Sphinx in master (thanks, Jacob!), so if adding
that link, it'd be helpful to do it here (note that you can click on the "edit
on Gitlab" tab in the bottom right and make an MR, which is handy for little
changes which you expect to get right in one attempt):
https://docs.petsc.org/en/master/faq/#any-useful-books-on-numerical-computing 


(And Sphinx has a utility to check all external links, so we should be able to 
clean up this and any other dead links in one pass before the next release)

> Am 04.02.2021 um 17:07 schrieb Victor Eijkhout :
> 
> 
> 
>> On , 2021Feb3, at 22:37, Barry Smith > > wrote:
>> 
>> 
>>   https://www.mcs.anl.gov/petsc/documentation/faq.html#computers 
>> 
> I happened to scroll up a line, and
> 
> Any useful books on numerical computing? "Writing Scientific Software: A 
> Guide to Good Style" 
> 
> 
> Is a dead link. 
> 
> Feel free to link to my 3 textbooks:
> 
> https://pages.tacc.utexas.edu/~eijkhout/istc/istc.html 
> 
> 
> Victor.
> 



Re: [petsc-users] SNES-norm is zero all the time

2021-02-03 Thread Patrick Sanan
Are you working from a particular example, or writing your own code from 
scratch?

> Am 03.02.2021 um 18:11 schrieb Stefano Zampini :
> 
> Cannot say anything if you don't provide a minimal code we can run to 
> reproduce the issue
> 
> Il giorno mer 3 feb 2021 alle ore 20:07 Sepideh Kavousi  > ha scritto:
> Hello,
> I have a very stupid problem that I am really ashamed of asking but it has 
> been with me for days and I do not know what to do. I want to solve the 
> Javier stokes equation with finite difference method. When I wanted to run 
> the code, the snes norm value does not change and i get an error for timestep 
> convergence. I thought I might do something wrong so,  I tried to simplify 
> the equation I want to solve to an easy form, given as:
> u_t=u_x +u_y
> Where _t is the time derivative and _x and _y are the derivative in x and y 
> direction. When I want to solve this problem, it still does not do anything 
> at all and the snes function norm is zero all the time. I know I am missing 
> something but does anyone have any idea what should I check in my code.  The 
> answers does not change with time.
> Best,
> Sepideh
> 
> 
> Get Outlook for iOS 
> 
> -- 
> Stefano



Re: [petsc-users] How to confirm the performance of asynchronous computations

2021-01-28 Thread Patrick Sanan


> Am 27.01.2021 um 17:30 schrieb Matthew Knepley :
> 
> On Wed, Jan 27, 2021 at 10:51 AM Viet H.Q.H.  <mailto:hqhv...@tohoku.ac.jp>> wrote:
> Dear Patrick and Matthew,
> 
> Thank you very much for your answers.
> My test code is just the code that contains the inner product calculation and 
> the calculation of the matrix-vector multiplication. It is easy to measure 
> the time of each calculation. I'm going to revise the code to measure the 
> time as it's done in Patrick's code. I will also switch to Mpich so as not to 
> worry about the MPI environment settings.
> 
> With the current number of nodes (8), the reduction time for calculating the 
> inner product is relatively small, but I think it would get bigger if I 
> increased the number of nodes to 100 or more. In this case, the latency of 
> the inner product calculation would increase, since the reduced values are 
> transmitted via a complex network structure with cables and switches. Then, 
> hopefully, the non-blocking communication can show its effectiveness.
> 
> This is very important to do _first_. It would probably only take you a day 
> to measure the Allreduce time on your target, say the whole machine you run 
> on.
> 
Seconded! A very simple performance model is to expect reductions to take C * 
log(P) time on P ranks, which is of course a slowly increasing function.
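As a rough illustration with the numbers from your run: at P = 160 the
blocking reduction took about 8.9e-06 s, and log2(160) is roughly 7.3, which
gives C on the order of 1.2e-06 s. Extrapolating that model to P = 10000 ranks
(log2 roughly 13.3) predicts only about 1.6e-05 s per reduction, still far
below the ~2.4e-04 s of local work in the test, so the reduction would remain
cheap unless your operator/preconditioner application is much faster than
that.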


>   Thanks
> 
>  Matt
>  
> Best,
> Viet
> 
> 
> On Wed, Jan 27, 2021 at 2:53 AM Patrick Sanan  <mailto:patrick.sa...@gmail.com>> wrote:
> 
> 
>> Am 26.01.2021 um 12:01 schrieb Matthew Knepley > <mailto:knep...@gmail.com>>:
>> 
>> On Mon, Jan 25, 2021 at 11:31 PM Viet H.Q.H. > <mailto:hqhv...@tohoku.ac.jp>> wrote:
>> Dear Patrick Sanan,
>> 
>> Thank you very much for your answer, especially for your code.
>> I was able to compile and run your code on 8 nodes with 20 processes per 
>> node. Below is the result
>> 
>> Testing with 160 MPI ranks
>> reducing an array of size 32 (256 bytes)
>> Running 5 burnin runs and 100 tests ...  Done.
>> For 100 runs with 5 burnin runs, on 160 MPI processes, min/max times over 
>> all ranks:
>> MPI timer resolution: 1.e-06 seconds
>> MPI timer resolution/#trials: 1.e-08 seconds
>> B.   Red. Only (min/max):8.850098e-06 / 8.890629e-06 seconds
>> N.B. Red. Only (min/max):1.725912e-05 / 1.733065e-05 seconds
>> Loc. Only (min/max): 2.364278e-04 / 2.374697e-04 seconds
>> Blocking (min/max):  2.650309e-04 / 2.650595e-04 seconds
>> Non-Blocking (min/max):  2.673984e-04 / 2.674508e-04 seconds
>> Observe to see if the local time is enough to hide the reduction, and see if 
>> the reduction is indeed hidden
>> 
>> It appears that the non-blocking computation with this test is no faster 
>> than the blocking computation.
>> I think I am missing some suitable Intel MPI environment settings.
>> I am now thinking about using MPICH, which does not require any environment 
>> settings for non-blocking computation.
>> Could you please let me know which MPI (MPICH or OpenMPI) you used in your 
>> tests?
> I used Cray MPI, which is based on MPICH. 
>> 
>> Note that in the test, the cost of the reduction is only 3% of the total 
>> cost. Is that worth putting time into? Is the cost definitely
>> larger in your application?
> This is very important! Note that the reductions are quite fast compared to 
> the local work. Even on a much larger number of ranks, they might still be 
> quite fast. The local work here is just some useless computation (here tuned 
> to be comparable to how long it took to do a reduction on a few thousand 
> ranks at the time). As Matt says, having some estimate of how long it takes 
> to apply your operator (matrix) and preconditioner is essential. Do you know 
> how to estimate that for your application? 
>> 
>>Thanks,
>> 
>>   Matt
>>  
>> Thanks again.
>> Viet
>> 
>> 
>> On Mon, Jan 25, 2021 at 7:47 PM Patrick Sanan > <mailto:patrick.sa...@gmail.com>> wrote:
>> Sorry about the delay in responding, but I'll add a couple of points here:
>> 
>> 
>> 1) It's important to have some reason to believe that pipelining will 
>> actually help your problem. Pipelined Krylov methods work by overlapping 
>> reductions with operator and preconditioner applications. So, to see 
>> speedup, the time for a reduction needs to be comparable to the time for the 
>> operator/preconditioner application. This will only be true in some cases - 
>> typical cases are when you have a large number of ra

Re: [petsc-users] How to confirm the performance of asynchronous computations

2021-01-26 Thread Patrick Sanan


> Am 26.01.2021 um 12:01 schrieb Matthew Knepley :
> 
> On Mon, Jan 25, 2021 at 11:31 PM Viet H.Q.H.  <mailto:hqhv...@tohoku.ac.jp>> wrote:
> Dear Patrick Sanan,
> 
> Thank you very much for your answer, especially for your code.
> I was able to compile and run your code on 8 nodes with 20 processes per 
> node. Below is the result
> 
> Testing with 160 MPI ranks
> reducing an array of size 32 (256 bytes)
> Running 5 burnin runs and 100 tests ...  Done.
> For 100 runs with 5 burnin runs, on 160 MPI processes, min/max times over all 
> ranks:
> MPI timer resolution: 1.e-06 seconds
> MPI timer resolution/#trials: 1.e-08 seconds
> B.   Red. Only (min/max):8.850098e-06 / 8.890629e-06 seconds
> N.B. Red. Only (min/max):1.725912e-05 / 1.733065e-05 seconds
> Loc. Only (min/max): 2.364278e-04 / 2.374697e-04 seconds
> Blocking (min/max):  2.650309e-04 / 2.650595e-04 seconds
> Non-Blocking (min/max):  2.673984e-04 / 2.674508e-04 seconds
> Observe to see if the local time is enough to hide the reduction, and see if 
> the reduction is indeed hidden
> 
> It appears that the non-blocking computation with this test is no faster than 
> the blocking computation.
> I think I am missing some suitable Intel MPI environment settings.
> I am now thinking about using MPICH, which does not require any environment 
> settings for non-blocking computation.
> Could you please let me know which MPI (MPICH or OpenMPI) you used in your 
> tests?
I used Cray MPI, which is based on MPICH. 
> 
> Note that in the test, the cost of the reduction is only 3% of the total 
> cost. Is that worth putting time into? Is the cost definitely
> larger in your application?
This is very important! Note that the reductions are quite fast compared to the 
local work. Even on a much larger number of ranks, they might still be quite 
fast. The local work here is just some useless computation (here tuned to be 
comparable to how long it took to do a reduction on a few thousand ranks at the 
time). As Matt says, having some estimate of how long it takes to apply your 
operator (matrix) and preconditioner is essential. Do you know how to estimate 
that for your application? 
> 
>Thanks,
> 
>   Matt
>  
> Thanks again.
> Viet
> 
> 
> On Mon, Jan 25, 2021 at 7:47 PM Patrick Sanan  <mailto:patrick.sa...@gmail.com>> wrote:
> Sorry about the delay in responding, but I'll add a couple of points here:
> 
> 
> 1) It's important to have some reason to believe that pipelining will 
> actually help your problem. Pipelined Krylov methods work by overlapping 
> reductions with operator and preconditioner applications. So, to see speedup, 
> the time for a reduction needs to be comparable to the time for the 
> operator/preconditioner application. This will only be true in some cases - 
> typical cases are when you have a large number of ranks/nodes, a slow 
> network, or very fast operator/preconditioner applications (assuming that 
> these require the same time on each rank - it's an interesting case when they 
> don't, but unless you say otherwise I'll assume this doesn't apply to your 
> use case). 
> 
> 2) As you're discovering, simply ensuring that asynchronous progress works, 
> at the pure MPI level, isn't as easy as it might be, as it's so dependent on 
> the MPI implementation.
> 
> 
> For both of these reasons, I suggest setting up a test that just directly 
> uses MPI (which you can of course do from a PETSc-style code) and allows you 
> to compare times for blocking and non-blocking reductions, overlapping some 
> (useless) local work. You should make sure to run multiple iterations within 
> the script, and also run the script multiple times on the cluster (bearing in 
> mind that it's possible that the performance will be affected by other users 
> of the system).
> 
> I attach an old script I found that I used to test some of these things, to 
> give a more concrete idea of what I mean. Note that this was used early on in 
> our own exploration of these topics so I'm only offering it to give an idea, 
> not as a meaningful benchmark in its own right.
> 
>> Am 25.01.2021 um 09:17 schrieb Viet H.Q.H. <hqhv...@tohoku.ac.jp>:
>> 
>> 
>> Dear Barry,
>> 
>> Thank you very much for your information.
>> 
>> It seems complicated to set environment variables to allow asynchronous 
>> progress and pinning threads to cores when using Intel MPI.
>> 
>> $ export I_MPI_ASYNC_PROGRESS = 1
>> $ export I_MPI_ASYNC_PROGRESS_PIN = 
>> 
>> https://techdecoded.intel.io/resources/hiding-communication-latency-using-mpi-3-non-blocking-collectives/
>>  
>> <ht

Re: [petsc-users] How to confirm the performance of asynchronous computations

2021-01-25 Thread Patrick Sanan
Sorry about the delay in responding, but I'll add a couple of points here:

1) It's important to have some reason to believe that pipelining will actually help your problem. Pipelined Krylov methods work by overlapping reductions with operator and preconditioner applications. So, to see speedup, the time for a reduction needs to be comparable to the time for the operator/preconditioner application. This will only be true in some cases - typical cases are when you have a large number of ranks/nodes, a slow network, or very fast operator/preconditioner applications (assuming that these require the same time on each rank - it's an interesting case when they don't, but unless you say otherwise I'll assume this doesn't apply to your use case).

2) As you're discovering, simply ensuring that asynchronous progress works, at the pure MPI level, isn't as easy as it might be, as it's so dependent on the MPI implementation.

For both of these reasons, I suggest setting up a test that just directly uses MPI (which you can of course do from a PETSc-style code) and allows you to compare times for blocking and non-blocking reductions, overlapping some (useless) local work. You should make sure to run multiple iterations within the script, and also run the script multiple times on the cluster (bearing in mind that it's possible that the performance will be affected by other users of the system).

I attach an old script I found that I used to test some of these things, to give a more concrete idea of what I mean. Note that this was used early on in our own exploration of these topics so I'm only offering it to give an idea, not as a meaningful benchmark in its own right.

allreduce_test.c
Description: Binary data
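
(Since the attachment doesn't render in the archive, here is a minimal sketch of the same kind of test - not the attached allreduce_test.c itself, just an illustration of the idea. The array size and amount of local work are arbitrary placeholders, and you'd want burn-in runs and many repetitions as described above.)

#include <mpi.h>
#include <stdio.h>

static double useless_local_work(int n)
{
  double s = 0.0;
  for (int i = 0; i < n; ++i) s += 1.0 / (1.0 + i); /* work to overlap with the reduction */
  return s;
}

int main(int argc, char **argv)
{
  const int   N = 32, WORK = 1000000;
  double      in[32], out[32], t0, t_blocking, t_nonblocking, dummy = 0.0;
  int         rank;
  MPI_Request req;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  for (int i = 0; i < N; ++i) in[i] = (double)rank;

  /* Blocking: reduction, then local work */
  MPI_Barrier(MPI_COMM_WORLD);
  t0 = MPI_Wtime();
  MPI_Allreduce(in, out, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
  dummy += useless_local_work(WORK);
  t_blocking = MPI_Wtime() - t0;

  /* Non-blocking: start the reduction, do local work, then wait */
  MPI_Barrier(MPI_COMM_WORLD);
  t0 = MPI_Wtime();
  MPI_Iallreduce(in, out, N, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD, &req);
  dummy += useless_local_work(WORK);
  MPI_Wait(&req, MPI_STATUS_IGNORE);
  t_nonblocking = MPI_Wtime() - t0;

  if (rank == 0)
    printf("blocking: %g s  non-blocking: %g s  (ignore: %g)\n",
           t_blocking, t_nonblocking, dummy);
  MPI_Finalize();
  return 0;
}

If the non-blocking time approaches the time for the local work alone, the reduction is being hidden; if it stays close to the blocking time, asynchronous progress isn't actually happening.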
Am 25.01.2021 um 09:17 schrieb Viet H.Q.H.:

Dear Barry,

Thank you very much for your information.

It seems complicated to set environment variables to allow asynchronous progress and pinning threads to cores when using Intel MPI.

$ export I_MPI_ASYNC_PROGRESS = 1
$ export I_MPI_ASYNC_PROGRESS_PIN = 

https://techdecoded.intel.io/resources/hiding-communication-latency-using-mpi-3-non-blocking-collectives/

I'm still not sure how to get an appropriate "CPU list" when running MPI with multiple nodes and multiple processes on one node.

Best,
Viet.

On Sat, Jan 23, 2021 at 3:01 AM Barry Smith wrote:

https://software.intel.com/content/www/us/en/develop/documentation/mpi-developer-guide-linux/top/additional-supported-features/asynchronous-progress-control.html

It states "and a partial support for non-blocking collectives ( MPI_Ibcast, MPI_Ireduce, and MPI_Iallreduce )."  I do not know what partial support means but you can try setting the variables and see if that helps.

On Jan 22, 2021, at 11:20 AM, Viet H.Q.H. wrote:

Dear Victor and Barry,

Thank you so much for your answers.

I fixed the code with the bug in the PetscCommSplitReductionBegin function as commented by Barry.

     ierr = PetscCommSplitReductionBegin (PetscObjectComm ((PetscObject) u));

It was also a mistake to set the vector size too small. I just set a vector size of 1 and ran the code on 4 nodes with 2 processors per node. The result is as follows

The time used for the asynchronous calculation: 0.022043
+ | u | = 1.
The time used for the synchronous calculation: 0.016188
+ | b | = 1.

Asynchronous computation still takes a longer time.

I also confirmed that PETSC_HAVE_MPI_IALLREDUCE is defined in the file $PETSC_DIR/include/petscconf.h

I built Petsc by using the following script

#!/usr/bin/bash
set -e
DATE="21.01.18"
MPIIT_DIR="/work/A/intel/2018_update2/compilers_and_libraries_2018.2.199/linux/mpi/intel64"
MKL_DIR="/work/A/intel/2018_update2/compilers_and_libraries_2018.2.199/linux/mkl"
INSTL_DIR="${HOME}/local/petsc-3.14.3"
BUILD_DIR="${HOME}/tmp/petsc/build_${DATE}"
PETSC_DIR="${HOME}/tmp/petsc"
cd ${PETSC_DIR}
./configure --force --prefix=${INSTL_DIR} --with-mpi-dir=${MPIIT_DIR} --with-fortran-bindings=0 --with-mpiexe=${MPIIT_DIR}/bin/mpiexec --with-valgrind-dir=${HOME}/local/valgrind --with-blaslapack-dir=${MKL_DIR} --download-make --with-debugging=0 COPTFLAGS='-O3 -march=native -mtune=native' CXXOPTFLAGS='-O3 -march=native -mtune=native' FOPTFLAGS='-O3 -march=native -mtune=native'
make PETSC_DIR=${HOME}/tmp/petsc PETSC_ARCH=arch-linux2-c-opt all
make PETSC_DIR=${HOME}/tmp/petsc PETSC_ARCH=arch-linux2-c-opt install

Intel 2018 also complies with the MPI-3 standard.

Are there specific settings for Intel MPI to obtain the performance of the MPI_IALLREDUCE function?

Sincerely,
Viet.

On Fri, Jan 22, 2021 at 11:20 AM Barry Smith wrote:

  ierr = VecNormBegin(u,NORM_2,);
  ierr = PetscCommSplitReductionBegin(PetscObjectComm((PetscObject)Ax));

How come you call this on Ax and not on u? For clarity, if nothing else, I think you should call it on u.

comb.c has

/*
      Split phase global vector reductions with support for combining the
   communication portion of several operations. Using MPI-1.1 support only

Re: [petsc-users] PetscObjectGetComm

2020-04-22 Thread Patrick Sanan
Perhaps the confusion here is related to the fact that an MPI_Comm is not
an integer identifying the communicator. Rather,
it's a pointer to a data structure which contains information about the
communicator (I'm not positive, but probably something like this).

You're converting that pointer to an int and printing it out. The value
happens to be the same on all ranks except 0, but this
doesn't directly tell you anything about equality of the MPI_comm objects
that those pointers point to.
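
If you want to check whether two communicators are actually the same or equivalent, MPI provides MPI_Comm_compare(). A minimal sketch (here "A" stands for your Mat):

  MPI_Comm comm;
  int      result;

  PetscObjectGetComm((PetscObject)A, &comm);
  MPI_Comm_compare(comm, PETSC_COMM_WORLD, &result);
  /* result is MPI_IDENT, MPI_CONGRUENT, MPI_SIMILAR, or MPI_UNEQUAL.
     Since PETSc duplicates the communicator internally, expect
     MPI_CONGRUENT (not necessarily MPI_IDENT) for an object created
     on PETSC_COMM_WORLD. */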

Am Mi., 22. Apr. 2020 um 15:28 Uhr schrieb Matthew Knepley <
knep...@gmail.com>:

> On Wed, Apr 22, 2020 at 3:07 AM Marius Buerkle  wrote:
>
>> I see, but I am still puzzled: why are the communicators different on
>> different nodes even though it is the same object?
>>
>
> This is the output of MPI_Comm_dup() on line 126 of tagm.c. Therefore, dup
> comms are not guaranteed to have the same id
> across multiple processes.
>
>   Thanks,
>
>  Matt
>
>
>>
>>
>> PETSc creates a duplicate of the communicator during object creation.
>>
>> https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PetscCommDuplicate.html
>>
>> Jose
>>
>>
>> > El 22 abr 2020, a las 8:40, Marius Buerkle  escribió:
>> >
>> > Hi Dave,
>> >
>> > I want to use it in Fortran if possible. But I tried both C and Fortran
>> just to see if it works in general. I am using MPICH 3.3.2. I attached the
>> MWE for C and Fortran with the output I get.
>> >
>> > Marius
>> >
>> >
>> >
>> >
>> >
>> > Hi,
>> >
>> > What is PetscObjectGetComm expected to return?
>> >
>> > As Patrick said, it returns the communicator associated with the petsc
>> object.
>> >
>> > I thought it would give the MPI communicator the object lives on. So if
>> I create A matrix on PETSC_COMM_WORLD a call of PetscObjectGetComm for A it
>> would return PETSC_COMM_WORLD? But it seems to return something else, and
>> while most of the nodes return a similar communicator some are giving a
>> different one.
>> >
>> > How are you actually comparing the communicators (send code snippet)?
>> Which MPI implementation are you using? And when are comparing comms is the
>> comparison code written in C it FORTRAN?
>> >
>> >
>> > That said, is there a way to get the MPI communicator a matrix lives on?
>> >
>> > You are using the correct function. There is a macro as well but it’s
>> best to use the function.
>> >
>> > Thanks,
>> > Dave
>> >
>> >
>> >
>> >
>> > Best,
>> > Marius
>> > 
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] PetscObjectGetComm

2020-04-21 Thread Patrick Sanan
To confirm, are you casting A to PetscObject? e.g.

 MPI_Comm comm;
 /* ... */
 PetscObjectGetComm((PetscObject)A,);


Am Mi., 22. Apr. 2020 um 07:11 Uhr schrieb Marius Buerkle :

> Hi,
>
> What is PetscObjectGetComm expected to return? I thought it would give the
> MPI communicator the object lives on. So if I create A matrix on
> PETSC_COMM_WORLD a call of PetscObjectGetComm for A it would return
> PETSC_COMM_WORLD? But it seems to return something else, and while most of
> the nodes return a similar communicator some are giving a different one.
> That said, is there a way to get the MPI communicator a matrix lives on?
>
> Best,
> Marius
>


Re: [petsc-users] Does Petsc provide the non-KSP iterative solver, e.g., SOR, for a linear system.

2020-03-18 Thread Patrick Sanan
I may not be interpreting the question correctly, but you can use SOR as
the only preconditioner, e.g.

-ksp_type gmres -pc_type sor

or you can use Richardson/SOR as your multigrid smoother, which might be
what you're asking for

 -ksp_type gmres -pc_type mg -pc_mg_levels 3
-mg_levels_ksp_type richardson -mg_levels_pc_type SOR



Am Mi., 18. März 2020 um 18:19 Uhr schrieb Xiaodong Liu :

> Hi, Petsc team,
>
> I am doing something using multigrid as a preconditioner for GMRES. From
> the official case,
>
> most use KSP (Chebyshev) and PC (SOR) for solving the preconditioning
> system for every level of multigrid except the coarse one.
>
> -ksp_type gmres -pc_type mg -pc_mg_levels 3
> -mg_levels_ksp_type Chebyshev -mg_levels_pc_type SOR
>
>  I would like to  use SOR for solving the preconditioning system directly.
> I can not find that Petsc supports non-KSP iterative solver for a linear
> system, right?
>
> If I am wrong, please let me know.
>
> Xiaodong Liu, PhD
> X: Computational Physics Division
> Los Alamos National Laboratory
> P.O. Box 1663,
> Los Alamos, NM 87544
> 505-709-0534
>


Re: [petsc-users] PETSc backward compatibility

2020-03-07 Thread Patrick Sanan
I agree that it's better to upgrade whenever possible, but if for reasons
out of your control you find yourself needing to support multiple versions
of PETSc,
which span API changes, you can use macros that PETSc provides for you, as
described here:
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Sys/PETSC_VERSION.html

I found this handy in a similar situation to yours. There was one lingering
production cluster that had an old version of PETSc in a
very-convenient-for-users module, even though everywhere else we could use
the latest.

#if PETSC_VERSION_GE(3,9,0)
call PCFactorSetUpMatSolverType(pc,ierr);CHKERRQ(ierr);
#else
call PCFactorSetUpMatSolverPackage(pc,ierr);CHKERRQ(ierr); ! PETSc 3.8 and
earlier
#endif


Am Fr., 6. März 2020 um 19:15 Uhr schrieb Matthew Knepley :

> On Fri, Mar 6, 2020 at 12:16 PM baikadi pranay 
> wrote:
>
>> Hello PETSc users,
>>
>> We have a FORTRAN code which uses eigenvalue solver routines. The version
>> of PETSc/SLEPc used is 3.11.1.We would be deploying the code on a cluster
>> which has PETSc/SLEPc version 3.6.4. We were wondering if we need to change
>> any functions or if there is backwards compatibility.
>>
>> Please let me know if you need any further information.
>>
>
> 3.6 is 5 years old. It would be easier to just install the new version on
> the cluster. That would take 20min max.
>
> However, if you would really like to make it compatible, you can look at
> the list of changes here:
> https://www.mcs.anl.gov/petsc/documentation/changes/index.html
>
>   Thanks,
>
>  Matt
>
>
>> Thank you,
>> Pranay.
>>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] Correct approach for updating deprecated code

2020-03-03 Thread Patrick Sanan
There was a relevant change in PETSc 3.8, you now need to call DMSetUp()
after DMDACreate1d(), DMDACreate2d(), or DMDACreate3d().

https://www.mcs.anl.gov/petsc/documentation/changes/38.html
"Replace calls to DMDACreateXd() with DMDACreateXd(), [DMSetFromOptions()]
DMSetUp()"

Am Di., 3. März 2020 um 05:46 Uhr schrieb Richard Beare via petsc-users <
petsc-users@mcs.anl.gov>:

> This is the error. Maybe nothing to do with the viewer part and something
> to do with changes in initialization? Something that has happened since
> version 3.6.3, perhaps.
>
> [0]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: Operation done in wrong order
> [0]PETSC ERROR: You should call DMSetUp() first
> [0]PETSC ERROR: See https://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.12.4-753-gbac983c101
>  GIT Date: 2020-02-18 15:05:54 +
> [0]PETSC ERROR: simul_atrophy on a  named m3j007 by rbeare Tue Mar  3
> 14:12:22 2020
> [0]PETSC ERROR: Configure options --with-cc=gcc-6 --with-cxx=g++-6
> --with-fc=gfortran --download-mpich --download-fblaslapack --with-clangu
> age=cxx --prefix=/opt/petsc/ --with-64-bit-indices=yes
> [0]PETSC ERROR: #1 DMDASetFieldName() line 68 in
> /petsc/src/dm/impls/da/dacorn.c
> [0]PETSC ERROR: #2 PetscAdLemTaras3D() line 68 in
> /simul-atrophy/src/includes/PetscAdLemTaras3D.hxx
> terminate called after throwing an instance of 'std::runtime_error'
>   what():  Error detected in C PETSc
> SIGABRT: abort
> PC=0x47282b m=0 sigcode=0
>
> goroutine 1 [running, locked to thread]:
> syscall.RawSyscall(0x3e, 0x5cbd, 0x6, 0x0, 0xc0001e3ef0, 0x48f422, 0x5cbd)
> /usr/local/go/1.11.1/src/syscall/asm_linux_amd64.s:78 +0x2b
> fp=0xc0001e3eb8 sp=0xc0001e3eb0 pc=0x47282b
> syscall.Kill(0x5cbd, 0x6, 0x4377de, 0xc0001e3f20)
> /usr/local/go/1.11.1/src/syscall/zsyscall_linux_amd64.go:597 +0x4b
> fp=0xc0001e3f00 sp=0xc0001e3eb8 pc=0x46f1db
> github.com/sylabs/singularity/internal/app/starter.Master.func4()
> internal/app/starter/master_linux.go:158 +0x3e fp=0xc0001e3f38
> sp=0xc0001e3f00 pc=0x8d51be
> github.com/sylabs/singularity/internal/pkg/util/mainthread.Execute.func1()
> internal/pkg/util/mainthread/mainthread.go:20 +0x2f
> fp=0xc0001e3f60 sp=0xc0001e3f38 pc=0x87472f
> main.main()
> cmd/starter/main_linux.go:102 +0x68 fp=0xc0001e3f98
> sp=0xc0001e3f60 pc=0x8d59f8
> runtime.main()
> /usr/local/go/1.11.1/src/runtime/proc.go:201 +0x207
> fp=0xc0001e3fe0 sp=0xc0001e3f98 pc=0x42faa7
> runtime.goexit()
> /usr/local/go/1.11.1/src/runtime/asm_amd64.s:1333 +0x1
> fp=0xc0001e3fe8 sp=0xc0001e3fe0 pc=0x45b4f1
>
> goroutine 5 [syscall]:
> os/signal.signal_recv(0xaa2620)
> /usr/local/go/1.11.1/src/runtime/sigqueue.go:139 +0x9c
> os/signal.loop()
> /usr/local/go/1.11.1/src/os/signal/signal_unix.go:23 +0x22
> created by os/signal.init.0
> /usr/local/go/1.11.1/src/os/signal/signal_unix.go:29 +0x41
>
> goroutine 7 [chan receive]:
>
> github.com/sylabs/singularity/internal/pkg/util/mainthread.Execute(0xc0003e83a0)
> internal/pkg/util/mainthread/mainthread.go:23 +0xb4
> github.com/sylabs/singularity/internal/app/starter.Master(0x4, 0xa,
> 0x2300, 0x5cca, 0xc000213e00)
> internal/app/starter/master_linux.go:157 +0x44e
> main.startup()
> cmd/starter/main_linux.go:73 +0x563
> created by main.main
> cmd/starter/main_linux.go:98 +0x3e
>
> On Tue, 25 Feb 2020 at 03:04, Matthew Knepley  wrote:
>
>> On Sun, Feb 23, 2020 at 6:45 PM Richard Beare 
>> wrote:
>>
>>> That's what I did (see below), but I got ordering errors (unfortunately
>>> deleted those logs too soon). I'll rerun if no one recognises what I've
>>> done wrong.
>>>
>>> PetscViewer viewer1;
>>> ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,fileName.c_str
>>> (),FILE_MODE_WRITE,&viewer1);CHKERRQ(ierr);
>>> //ierr =
>>> PetscViewerSetFormat(viewer1,PETSC_VIEWER_BINARY_MATLAB);CHKERRQ(ierr);
>>> ierr = PetscViewerPushFormat(viewer1,PETSC_VIEWER_BINARY_MATLAB);CHKERRQ
>>> (ierr);
>>>
>>
>> This should not cause problems. However, is it possible that somewhere
>> you are pushing a format
>> again and again without popping? This could exceed the stack size.
>>
>>   Thanks,
>>
>> Matt
>>
>>
>>> ierr = PetscObjectSetName((PetscObject)mX,"x");CHKERRQ(ierr);
>>> ierr = PetscObjectSetName((PetscObject)mB,"b");CHKERRQ(ierr);
>>>
>>> On Mon, 24 Feb 2020 at 10:43, Matthew Knepley  wrote:
>>>
 On Sun, Feb 23, 2020 at 6:25 PM Richard Beare via petsc-users <
 petsc-users@mcs.anl.gov> wrote:

>
> Hi,
> The following code gives a deprecation warning. What is the correct
> way of updating the use of ViewerSetFormat to ViewerPushFormat (which I
> presume is the preferred replacement). My first attempt gave errors
> concerning ordering.
>

 You can't just change SetFormat to PushFormat here?

Re: [petsc-users] Vertex only unstructured mesh with DMPlex and DMSWARM

2020-02-12 Thread Patrick Sanan
DMSwarm has DMSwarmMigrate() which might be the closest thing to
DMPlexDistribute().
https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/DMSWARM/DMSwarmMigrate.html
Of course, it's good to create particles in parallel when possible.
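
In C (which is what petsc4py wraps), the rough sequence would look something like the sketch below - "celldm", "npoints", "rank", and "ierr" are assumed to already exist, and I haven't tested this exact snippet:

  DM         swarm;
  PetscInt   bs;
  PetscReal *coor;

  ierr = DMCreate(PETSC_COMM_WORLD, &swarm);CHKERRQ(ierr);
  ierr = DMSetType(swarm, DMSWARM);CHKERRQ(ierr);
  ierr = DMSetDimension(swarm, 2);CHKERRQ(ierr);
  ierr = DMSwarmSetType(swarm, DMSWARM_PIC);CHKERRQ(ierr);
  ierr = DMSwarmSetCellDM(swarm, celldm);CHKERRQ(ierr);   /* the background DMPlex/DMDA */
  ierr = DMSwarmFinalizeFieldRegister(swarm);CHKERRQ(ierr);
  ierr = DMSwarmSetLocalSizes(swarm, rank == 0 ? npoints : 0, 0);CHKERRQ(ierr);
  ierr = DMSwarmGetField(swarm, DMSwarmPICField_coor, &bs, NULL, (void **)&coor);CHKERRQ(ierr);
  /* ... fill coor[] with point coordinates on rank 0 ... */
  ierr = DMSwarmRestoreField(swarm, DMSwarmPICField_coor, &bs, NULL, (void **)&coor);CHKERRQ(ierr);
  ierr = DMSwarmMigrate(swarm, PETSC_TRUE);CHKERRQ(ierr); /* send points to the ranks owning their cells */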

Am Mi., 12. Feb. 2020 um 12:56 Uhr schrieb Hill, Reuben <
reuben.hil...@imperial.ac.uk>:

> I'm a new Firedrake developer working on getting my head around PETSc. As
> far as I'm aware, all our PETSc calls are done via petsc4py.
>
> I'm after general help and advice on two fronts:
>
>
> *1*:
>
> I’m trying to represent a point cloud as a vertex-only mesh in an attempt
> to play nicely with the firedrake stack. If I try to do this using
> firedrake I manage to cause a PETSc seg fault error at the point of calling
>
> PETSc.DMPlex().createFromCellList(dim, cells, coords, comm=comm)
>
>
> with dim=0, cells=[[0]], coords=[[1., 2.]] and comm=COMM_WORLD.
>
> Output:
>
> [0]PETSC ERROR:
> 
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see
> https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
> X to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
> run
> [0]PETSC ERROR: to get more information on the crash.
> application called MPI_Abort(MPI_COMM_WORLD, 50152059) - process 0
>
>
> I’m now looking into getting firedrake to make a DMSWARM which seems to
> have been designed for doing something closer to this and allows nice
> things such as being able to make the particles (for me - the vertices of
> the mesh) move and jump between MPI ranks a-la particle in cell. I note
> DMSwarm docs don't suggest there is an equivalent of the plex.distribute()
> method in petsc4py (which I believe calls DMPlexDistribute) that a DMPlex
> has. In firedrake we create empty DMPlexes on each rank except 0, then call
> plex.distribute() to take care of parallel partitioning. How, therefore,
> should I meant to go about distributing particles across MPI ranks?
>
>
> *2*:
>
> I'm aware these questions may be very naive. Any advice for learning about
> relevant bits of PETSc for would be very much appreciated. I'm in chapter 2
> of the excellent manual (
> https://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf) and I'm also
> attempting to understand the DMSwarm example. I presume in oder to
> understand DMSWARM I really ought to understand DMs more generally (i.e.
> read the whole manual)?
>
>
> Many thanks
> Reuben Hill
>
>
>1.
>
>


Re: [petsc-users] chowiluviennacl

2020-01-21 Thread Patrick Sanan
Just some more background on that algorithm for others reading (which is
obviously explained better in the paper, which Richard linked). As others
point out, I don't think it fits your use case.

The motivation for the Chow-Patel algorithm is the fact that traditional
ILU preconditioners don't work well in "fine-grained parallel" environments
like GPUs. "Triangularity" is something associated with lots of data
dependencies - think about Gaussian elimination  - the whole idea is
solving one equation at a time, based on the solutions of other equations.
The Chow-Patel approach is to approach things in a clever way (solving a
set of nonlinear equations describing the individual entries of the
factors) to simultaneously compute all the entries of the triangular
factors, asynchronously on lots of threads. That doesn't solve the problem
of how to solve the resulting triangular systems in parallel, though, so
that's done with an iterative approach (that is, you can approximate L^(-1)
with a polynomial in L). It's a new approach and thus should be considered
experimental. Key to note is that all of this is only explored or
implemented on a single node (shared-memory domain), so if you want to use
this preconditioner on multiple ranks it needs to be a sub-preconditioner
in a block Jacobi, ASM, or other method with "local subsolves".

Am Mo., 20. Jan. 2020 um 23:41 Uhr schrieb Mills, Richard Tran via
petsc-users :

> Hi Xiangdong,
>
> Maybe I am misunderstanding you, but it sounds like you want an exact
> direct solution, so I don't understand why you are using an incomplete
> factorization solver for this. SuperLU_DIST (as Mark has suggested) or
> MUMPS are two such packages that provide MPI-parallel sparse LU
> factorization. If you need GPU support, SuperLU_DIST has such support. I
> don't know the status of our support for using the GPU capabilities of
> this, though -- I assume another developer can chime in regarding this.
>
> Note that the ILU provided by "chowiluiennacl" employs a very different
> algorithm than the standard PCILU in PETSc, and you shouldn't expect to get
> the same incomplete factorization. The algorithm is described in this paper
> by Chow and Patel:
>
> https://www.cc.gatech.edu/~echow/pubs/parilu-sisc.pdf
>
> Best regards,
> Richard
> On 1/15/20 11:39 AM, Xiangdong wrote:
>
> I just submitted the issue: https://gitlab.com/petsc/petsc/issues/535
>
> What I really want is an exact Block Tri-diagonal solver on GPU. Since for
> block tridiagonal system, ILU0 would be the same as ILU. So I tried the
> chowiluviennacl. but I found that the default parameters does not produce
> the same ILU0 factorization as the CPU ones (PCILU). My guess is that if I
> increase the number of sweeps chow_patel_ilu_config.sweeps(3), it may give
> a better result. So the option Keys would be helpful.
>
> Since Mark mentioned the Superlu's GPU feature, can I use superlu or
> hypre's GPU functionality through PETSc?
>
> Thank you.
>
> Xiangdong
>
> On Wed, Jan 15, 2020 at 2:22 PM Matthew Knepley  wrote:
>
>> On Wed, Jan 15, 2020 at 1:48 PM Xiangdong  wrote:
>>
>>> In the ViennaCL manual
>>> http://viennacl.sourceforge.net/doc/manual-algorithms.html
>>>
>>> It did expose two parameters:
>>>
>>> // configuration of preconditioner:
>>> viennacl::linalg::chow_patel_tag chow_patel_ilu_config;
>>> chow_patel_ilu_config.sweeps(3); // three nonlinear sweeps
>>> chow_patel_ilu_config.jacobi_iters(2); // two Jacobi iterations per
>>> triangular 'solve' Rx=r
>>>
>>> and mentioned that:
>>> The number of nonlinear sweeps and Jacobi iterations need to be set
>>> problem-specific for best performance.
>>>
>>> In the PETSc' implementation:
>>>
>>> viennacl::linalg::chow_patel_tag ilu_tag;
>>> ViennaCLAIJMatrix *mat = (ViennaCLAIJMatrix*)gpustruct->mat;
>>> ilu->CHOWILUVIENNACL = new
>>> viennacl::linalg::chow_patel_ilu_precond
>>> >(*mat, ilu_tag);
>>>
>>> The default is used. Is it possible to expose these two parameters so
>>> that user can change it through option keys?
>>>
>>
>> Yes. Do you mind making an issue for it? That way we can better keep
>> track.
>>
>> https://gitlab.com/petsc/petsc/issues
>>
>>   Thanks,
>>
>> Matt
>>
>>
>>> Thank you.
>>>
>>> Xiangdong
>>>
>>> On Wed, Jan 15, 2020 at 12:40 PM Matthew Knepley 
>>> wrote:
>>>
 On Wed, Jan 15, 2020 at 9:59 AM Xiangdong  wrote:

> Maybe I am not clear. I want to solve the block tridiagonal system
> Tx=b a few times with same T but different b. On CPU, I can have it by
> applying the ILU0 and reuse the factorization. Since it is block
> tridiagonal, ILU0 would give same results as LU.
>
> I am trying to do the same thing on GPU with chowiluviennacl, but
> found default factorization does not produce the exact factorization for
> tridiagonal system. Can we tight the drop off tolerance so that it can 
> work
> as LU for tridiagonal system?
>

 There are no options in our implementation. You could 

Re: [petsc-users] error handling

2020-01-21 Thread Patrick Sanan
Just to clarify: are you using PETSc within some larger application, which
you are hoping to continue executing, even after PETSc produces an error?

Am Di., 21. Jan. 2020 um 01:33 Uhr schrieb Sam Guo :

> Hi Barry,
>   I understand ierr != 0 means  something catastrophic. I just want to
> release all memory before I exit PETSc.
>
> Thanks,
> Sam
>
> On Mon, Jan 20, 2020 at 4:06 PM Smith, Barry F. 
> wrote:
>
>>
>>   Sam,
>>
>> I am not sure what your goal is but PETSc error return codes are
>> error return codes not exceptions. They mean that something catastrophic
>> happened and there is no recovery.
>>
>> Note that PETSc solvers do not return nonzero error codes on failure
>> to converge etc. You call, for example, KPSGetConvergedReason() after a KSP
>> solve to see if it has failed, this is not a catastrophic failure. If a
>> MatCreate() or any other call returns a nonzero ierr the game is up, you
>> cannot continue running PETSc.
>>
>>Barry
>>
>>
>> > On Jan 20, 2020, at 5:41 PM, Matthew Knepley  wrote:
>> >
>> > Not if you initialize the pointers to zero: Mat A = NULL.
>> >
>> >Matt
>> >
>> > On Mon, Jan 20, 2020 at 6:31 PM Sam Guo  wrote:
>> > I mean MatDestroy.
>> >
>> > On Mon, Jan 20, 2020 at 3:28 PM Sam Guo  wrote:
>> > Does it hurt to call Destroy function without calling CreateFunction?
>> For example
>> > Mat A, B;
>> > PetscErrorCode  ierr1, ierr2;
>> > ierr1 = MatCreate(PETSC_COMM_WORLD,&A);
>> > if(ierr1 == 0)
>> > {
>> >   ierr2 = MatCreate(PETSC_COMM_WORLD,&B);
>> > }
>> > if(ierr1 != 0 || ierr2 != 0)
>> > {
>> >   MatDestroy(&A);
>> >   MatDestroy(&B); // if ierr1 != 0, MatCreate is not called on B. Does it
>> hurt to call MatDestroy on B here?
>> > }
>> >
>> >
>> >
>> > On Mon, Jan 20, 2020 at 11:11 AM Dave May 
>> wrote:
>> >
>> >
>> > On Mon 20. Jan 2020 at 19:47, Sam Guo  wrote:
>> > Can I assume if there is MatCreat or VecCreate, I should clean up the
>> memory myself?
>> >
>> > Yes. You will need to call the matching Destroy function.
>> >
>> >
>> >
>> > On Mon, Jan 20, 2020 at 10:45 AM Sam Guo  wrote:
>> > I only include the first few lines of SLEPc example. What about
>> following
>> >   ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
>> >   ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
>> > Is there any memory  lost?
>> >
>> > On Mon, Jan 20, 2020 at 10:41 AM Dave May 
>> wrote:
>> >
>> >
>> > On Mon 20. Jan 2020 at 19:39, Sam Guo  wrote:
>> > I don't have a specific case yet. Currently every call of PETSc is
>> checked. If ierr is not zero, print the error and return. For example,
>> >Mat A; /* problem matrix */
>> >EPS eps; /* eigenproblem solver context */
>> >EPSType type;
>> >   PetscReal error,tol,re,im;
>> >   PetscScalar kr,ki; Vec xr,xi;
>> >   PetscInt n=30,i,Istart,Iend,nev,maxit,its,nconv;
>> >   PetscErrorCode ierr;
>> >   ierr = SlepcInitialize(&argc,&argv,(char*)0,help);CHKERRQ(ierr);
>> >   ierr = PetscOptionsGetInt(NULL,NULL,"-n",&n,NULL);CHKERRQ(ierr);
>> >ierr = PetscPrintf(PETSC_COMM_WORLD,"\n1-D Laplacian Eigenproblem,
>> n=%D\n\n",n);CHKERRQ(ierr);
>> >
>> > I am wondering if the memory is lost by calling CHKERRQ.
>> >
>> > No.
>> >
>> >
>> >
>> > On Mon, Jan 20, 2020 at 10:14 AM Dave May 
>> wrote:
>> >
>> >
>> > On Mon 20. Jan 2020 at 19:11, Sam Guo  wrote:
>> > Dear PETSc dev team,
>> >If PETSc function returns an error, what's the correct way to clean
>> PETSc?
>> >
>> > The answer depends on the error message reported. Send the complete
>> error message and a better answer can be provided.
>> >
>> > Particularly how to clean up the memory?
>> >
>> > Totally depends on the objects which aren’t being freed. You need to
>> provide more information
>> >
>> > Thanks
>> > Dave
>> >
>> >
>> > Thanks,
>> > Sam
>> >
>> >
>> > --
>> > What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> > -- Norbert Wiener
>> >
>> > https://www.cse.buffalo.edu/~knepley/
>>
>>


Re: [petsc-users] Stopping TS in PostStep

2020-01-07 Thread Patrick Sanan
I think the standard approach is

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSetConvergedReason.html
 


with

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TS_CONVERGED_USER.html
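
A minimal sketch of what that might look like (the stopping test here is just a placeholder):

static PetscErrorCode MyPostStep(TS ts)
{
  PetscErrorCode ierr;
  Vec            u;
  PetscReal      norm;

  PetscFunctionBeginUser;
  ierr = TSGetSolution(ts, &u);CHKERRQ(ierr);
  ierr = VecNorm(u, NORM_2, &norm);CHKERRQ(ierr);
  if (norm < 1.0e-6) { /* placeholder for your actual test */
    ierr = TSSetConvergedReason(ts, TS_CONVERGED_USER);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

/* ... registered with TSSetPostStep(ts, MyPostStep); */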
 



> Am 07.01.2020 um 12:31 schrieb Mark Adams :
> 
> I have a test in a PostStep function in TS and I would like to terminate, 
> with success, when my test is satisfied. What is the best way to do that?
> Thanks,
> Mark



Re: [petsc-users] Outputting matrix for viewing in matlab

2019-11-29 Thread Patrick Sanan


> Am 29.11.2019 um 09:14 schrieb Patrick Sanan :
> 
> PETSc has its own binary format, which is not the same as MATLAB's. 
> 
> However, PETSc includes some MATLAB/Octave scripts which will load these 
> binary files.
> 
> See $PETSC_DIR/share/matlab/PetscBinaryRead.m - there are some examples in 
> the comments at the top of that file.
Correction: $PETSC_DIR/share/petsc/matlab/PetscBinaryRead.m 
> 
> 
> Note that you will probably want to add $PETSC_DIR/share/matlab to your 
> MATLAB path so that you can run the script. This is what I have for Octave, 
> but I'm not sure if it this, precisely, works in MATLAB:
> 
> $ cat ~/.octaverc
> PETSC_DIR=getenv('PETSC_DIR');
> if length(PETSC_DIR)==0
>   PETSC_DIR='~/code/petsc'
> end
> addpath([PETSC_DIR,'/share/petsc/matlab'])
> 
> (As an aside, note that there are also scripts included to load PETSc binary 
> files to use with numpy/scipy in  Python, e.g. 
> $PETSC_DIR/lib/petsc/bin/PetscBinaryIO.py)
> 
>> Am 29.11.2019 um 04:07 schrieb baikadi pranay <pranayreddy...@gmail.com>:
>> 
>> Hello PETSc users, 
>> 
>> I have a sparse matrix built and I want to output the matrix for viewing in 
>> matlab. However i'm having difficulty outputting the matrix. I am writing my 
>> program in Fortran90 and I've included the following lines to output the 
>> matrix.
>> 
>>  call 
>> PetscViewerBinaryOpen(PETSC_COMM_SELF,'matrix',FILE_MODE_WRITE,view,ierr)
>>  call PetscViewerBinaryGetDescriptor(view,fd,ierr)
>>  call PetscBinaryWrite(fd,ham,1,PETSC_SCALAR,PETSC_FALSE,ierr)
>> 
>> These lines do create a matrix but matlab says its not a binary file. Could 
>> you please provide me some inputs on where I'm going wrong and how to 
>> proceed with this problem. I can provide any further information that you 
>> might need to help me solve this problem. 
>> 
>> 
>> Thank you.
>> 
>> Sincerely,
>> Pranay. 
> 



Re: [petsc-users] Outputting matrix for viewing in matlab

2019-11-29 Thread Patrick Sanan
PETSc has its own binary format, which is not the same as MATLAB's. 

However, PETSc includes some MATLAB/Octave scripts which will load these binary 
files.

See $PETSC_DIR/share/matlab/PetscBinaryRead.m - there are some examples in the 
comments at the top of that file.

Note that you will probably want to add $PETSC_DIR/share/matlab to your MATLAB 
path so that you can run the script. This is what I have for Octave, but I'm 
not sure if it this, precisely, works in MATLAB:

$ cat ~/.octaverc
PETSC_DIR=getenv('PETSC_DIR');
if length(PETSC_DIR)==0
  PETSC_DIR='~/code/petsc'
end
addpath([PETSC_DIR,'/share/petsc/matlab'])

(As an aside, note that there are also scripts included to load PETSc binary 
files to use with numpy/scipy in  Python, e.g. 
$PETSC_DIR/lib/petsc/bin/PetscBinaryIO.py)

> Am 29.11.2019 um 04:07 schrieb baikadi pranay :
> 
> Hello PETSc users, 
> 
> I have a sparse matrix built and I want to output the matrix for viewing in 
> matlab. However i'm having difficulty outputting the matrix. I am writing my 
> program in Fortran90 and I've included the following lines to output the 
> matrix.
> 
>  call 
> PetscViewerBinaryOpen(PETSC_COMM_SELF,'matrix',FILE_MODE_WRITE,view,ierr)
>  call PetscViewerBinaryGetDescriptor(view,fd,ierr)
>  call PetscBinaryWrite(fd,ham,1,PETSC_SCALAR,PETSC_FALSE,ierr)
> 
> These lines do create a matrix but matlab says its not a binary file. Could 
> you please provide me some inputs on where I'm going wrong and how to proceed 
> with this problem. I can provide any further information that you might need 
> to help me solve this problem. 
> 
> 
> Thank you.
> 
> Sincerely,
> Pranay. 



Re: [petsc-users] 1. Using petsc with an existing domain decomposition. petsc-users Digest, Vol 130, Issue 43

2019-10-13 Thread Patrick Sanan via petsc-users
Hi Pierre - there's still development going on with DMStag and block PCs,  so 
if you don't find the example you're looking for, please let me know.

Best,
Patrick

> Am 13.10.2019 um 20:11 schrieb Pierre Gubernatis via petsc-users:
> 
> Thanks Matt, I Hope I can find a simple example of these DMStags...
> I'll start with the linear stage of our algorithm (a Newton iteration used 
> for minimizing a residual). Next I will consider giving the whole non linear 
> problem to petsc...
> Regards.
> 
> Le dim. 13 oct. 2019 à 19:00,  > a écrit :
> Send petsc-users mailing list submissions to
> petsc-users@mcs.anl.gov 
> 
> To subscribe or unsubscribe via the World Wide Web, visit
> https://lists.mcs.anl.gov/mailman/listinfo/petsc-users 
> 
> or, via email, send a message with subject or body 'help' to
> petsc-users-requ...@mcs.anl.gov 
> 
> 
> You can reach the person managing the list at
> petsc-users-ow...@mcs.anl.gov 
> 
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of petsc-users digest..."
> 
> 
> Today's Topics:
> 
>1.  Using petsc with an existing domain decomposition.
>   (Pierre Gubernatis)
>2. Re:  Using petsc with an existing domain decomposition.
>   (Matthew Knepley)
>3. Re:  Using petsc with an existing domain decomposition.
>   (Mark Adams)
> 
> 
> --
> 
> Message: 1
> Date: Sun, 13 Oct 2019 11:24:04 +0200
> From: Pierre Gubernatis  >
> To: petsc-users@mcs.anl.gov 
> Subject: [petsc-users] Using petsc with an existing domain
> decomposition.
> Message-ID:
>  >
> Content-Type: text/plain; charset="utf-8"
> 
> Hello all,
> 
> It sounds like the best way to introduce petsc in a code is not to introduce
> it, but develop the code over the petsc structure.
> 
> It is probably true but my problem is that my existing code already is
> equipped with a domain decomposition based on MPI (a typical thermal
> hydraulics code with a Cartesian staggered mesh)
> 
> The user can slice the domain in sub-domains and construct a linear problem
> by block: each sub-domain assembles its part of the operator and its part
> of the RHS.
> 
> I am wondering what is the best way now to introduce petsc (considering
> that I don?t want to assemble a global operator on a given proc). Is there
> an example that would show how to introduce petsc in this situation ?
> Thank you, Pierre
> 
> --
> 
> Message: 2
> Date: Sun, 13 Oct 2019 07:22:27 -0400
> From: Matthew Knepley mailto:knep...@gmail.com>>
> To: Pierre Gubernatis  >
> Cc: PETSc mailto:petsc-users@mcs.anl.gov>>
> Subject: Re: [petsc-users] Using petsc with an existing domain
> decomposition.
> Message-ID:
>  >
> Content-Type: text/plain; charset="utf-8"
> 
> Without having seen your code, it sounds to me like the best strategy here
> is to:
> 
>   1) Produce a mirror of your mesh using DMStag
> 
>   2) Use that DM to construct the linear block problems, which can then be
> solved by PETSc
> 
>   Since the PETSc grid matches your own, you can share the solution
> vectors between your code and PETSc,
>   so it will fit nicely into your existing structure.
> 
> Once that works, if the entire mesh is Cartesian, you could use DMStag to
> model that and let PETSc handle parallel
> decomposition. At the same stage, you could decide to let PETSc SNES handle
> the nonlinear problem, rather than
> just the linear parts.
> 
>   Thanks,
> 
> Matt
> 
> On Sun, Oct 13, 2019 at 5:25 AM Pierre Gubernatis via petsc-users <
> petsc-users@mcs.anl.gov > wrote:
> 
> > Hello all,
> >
> > It souds that the best way to introduce petsc in a code is not to
> > introduce it, but develop the code over the petsc structure.
> >
> > It is probably true but my problem is that my existing code already is
> > equipped with a domain decomposition based on MPI (a typical themal
> > hydraulic with cartesian staggered mesh)
> >
> > The user can slice the domain in sub-domains and construct a linear
> > problem by block: each sub-domain assembles its part of the operator and
> > its part of the RHS.
> >
> > I am wondering what is the best way now to 

Re: [petsc-users] PCMG: level dependent smoothing steps

2019-02-19 Thread Patrick Sanan via petsc-users
Will it work for you to use

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGGetSmoother.html

to pull out the KSP for the level you're interested in (Note link to
PCMGGetSmootherDown() if you want to control up and down smoothers
separately) and then use

https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPSetTolerances.html

to set the maximum iterations?
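
For example, something like this (untested sketch; here level 1 gets at most 3 smoothing iterations, assuming pc has already been set up as PCMG with its levels defined, e.g. obtained via KSPGetPC()):

  KSP smoother;
  ierr = PCMGGetSmoother(pc, 1, &smoother);CHKERRQ(ierr); /* or PCMGGetSmootherDown()/Up() */
  ierr = KSPSetTolerances(smoother, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, 3);CHKERRQ(ierr);

The same thing should also be reachable from the command line with something like -mg_levels_1_ksp_max_it 3.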

Am Di., 19. Feb. 2019 um 16:07 Uhr schrieb Pietro Benedusi via petsc-users <
petsc-users@mcs.anl.gov>:

> Dear PETSc team,
>
> Is there a way to set the number of smoothing steps done inside a specific
> multigrid level l ?
> Something like:
>
> PCMGSetNumberSmoothDown 
> (PC
>   
> pc,PetscInt 
> 
>  n, PetscInt 
> 
>  l)
>
>
> In this way I could, for example set 3 smoothing steps on level 1 and 7
> smoothing steps on level 2.
>
> Thank you!
> Pietro
>


Re: [petsc-users] DMDA with dimension of size 1

2018-10-01 Thread Patrick Sanan
Whoops, sent that patch too fast (forgot to update SETERRQ3 to SETERRQ4).
Updated patch for maint attached.

Am Mo., 1. Okt. 2018 um 16:55 Uhr schrieb Patrick Sanan <
patrick.sa...@gmail.com>:

> Meshes of size 1 should work.
>
> Looks like there is a bug in the code to produce this error message (patch
> for maint attached). It's not outputting the "P" (size in the 3rd
> dimension) component.
>
> This is just speculation without a full error message or code to
> reproduce, but perhaps the size in the third dimension is an uninitialized
> value which is triggering this warning with a 10 x 10 x (huge garbage) x 1
> (dof) mesh.
>
> Am Mo., 1. Okt. 2018 um 15:59 Uhr schrieb Phil Tooley <
> phil.too...@sheffield.ac.uk>:
>
>> Hi all,
>>
>> Is it valid to have a DMDA with one of the dimensions of size 1.  I was
>> hoping to avoid having to write explicit logic to handle the case that I
>> am working on a 2D rather than a 3D image for my current application.
>> Unfortunately when I try to construct such a DMDA I get an error:
>>
>> [0]PETSC ERROR: - Error Message
>> --
>> [0]PETSC ERROR: Overflow in integer operation:
>> [0]PETSC ERROR: Mesh of 10 by 10 by 1 (dof) is too large for 32 bit
>> indices
>>
>> Is there a workaround other than "write everything twice"?
>>
>> Thanks
>>
>> Phil
>>
>> --
>> Phil Tooley
>> Research Software Engineering
>> University of Sheffield
>>
>>


0001-DMSetUp_DA_3D-fix-warning-message-for-int32-mesh-siz.patch
Description: Binary data


Re: [petsc-users] DMDA with dimension of size 1

2018-10-01 Thread Patrick Sanan
Meshes of size 1 should work.

Looks like there is a bug in the code to produce this error message (patch
for maint attached). It's not outputting the "P" (size in the 3rd
dimension) component.

This is just speculation without a full error message or code to reproduce,
but perhaps the size in the third dimension is an uninitialized value which
is triggering this warning with a 10 x 10 x (huge garbage) x 1 (dof) mesh.
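
For reference, an explicitly sized 10 x 10 x 1 DMDA would be created with something like the following sketch (stencil and boundary settings are placeholders, and ierr is assumed to be declared):

  DM da;
  ierr = DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
                      DMDA_STENCIL_STAR, 10, 10, 1, /* M, N, P = 1 */
                      PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
                      1, 1, NULL, NULL, NULL, &da);CHKERRQ(ierr);
  ierr = DMSetUp(da);CHKERRQ(ierr);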

Am Mo., 1. Okt. 2018 um 15:59 Uhr schrieb Phil Tooley <
phil.too...@sheffield.ac.uk>:

> Hi all,
>
> Is it valid to have a DMDA with one of the dimensions of size 1.  I was
> hoping to avoid having to write explicit logic to handle the case that I
> am working on a 2D rather than a 3D image for my current application.
> Unfortunately when I try to construct such a DMDA I get an error:
>
> [0]PETSC ERROR: - Error Message
> --
> [0]PETSC ERROR: Overflow in integer operation:
> [0]PETSC ERROR: Mesh of 10 by 10 by 1 (dof) is too large for 32 bit indices
>
> Is there a workaround other than "write everything twice"?
>
> Thanks
>
> Phil
>
> --
> Phil Tooley
> Research Software Engineering
> University of Sheffield
>
>


0001-DMSetUp_DA_3D-fix-warning-message-for-int32-mesh-siz.patch
Description: Binary data


Re: [petsc-users] Performance of Fieldsplit PC

2017-11-07 Thread Patrick Sanan
From what you're describing, it sounds like the solver you're using is
GMRES (if you are using the default), preconditioned with fieldsplit with
nested CG-Jacobi solves. That is, your preconditioner involves inner CG
solves on each field, so is a much "heavier". This seems consistent with
your observation of fewer (outer) Krylov iterations but much more work
being done per iteration.

This should all be visible with -ksp_view.

Do you see what you expect if, instead of CG/Jacobi on each block, you use
Preonly/Jacobi on each block?
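
For example (a sketch, assuming the default numeric field names 0 and 1):

  -pc_type fieldsplit -pc_fieldsplit_type additive \
    -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type jacobi \
    -fieldsplit_1_ksp_type preonly -fieldsplit_1_pc_type jacobi \
    -ksp_view

With preonly on each block, the preconditioner application is a single Jacobi sweep per field, which should bring the cost per outer iteration back in line with the scalar case.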

On Tue, Nov 7, 2017 at 2:43 PM, Bernardo Rocha <
bernardomartinsro...@gmail.com> wrote:

> Hello everyone,
>
> I have a general question about the performance of the PCFieldSplit
> that I'm not sure if I understood properly.
>
> Consider a simple Poisson problem discretized by FEM into a system Ax=b
> which is then solved by CG and Jacobi.
>
> Then, I create a "vectorial Poisson" problem by simply adding another block
> of this problem to create a block-like version of it.
> Something like
> [ [A, 0]
>   [0, A]]
> then I create a PCFieldSplit with CG and Jacobi for each block.
>
> Either with additive or multiplicative fieldsplit, the PC is much better
> and solves it
> with fewer iterations than the scalar case. However, the execution time
> taken by
> the PCFieldSplit is much bigger than the simple Jacobi for the scalar case.
>
> (From -log_view I see all the time difference in PCApply)
>
> Why is this happening?
>
> Best regards,
> Bernardo
>
>


Re: [petsc-users] Manual mistaken about VecSetValues

2017-04-11 Thread Patrick Sanan
The path is important, as there are many examples named "ex2.c". The
example on pp. 32-35 is
${PETSC_DIR}/src/ksp/ksp/examples/tutorials/ex2.c , which is about
KSP, not Vec.



2017-04-11 9:53 GMT+02:00 Joachim Wuttke :
> The PETSc User Manual [ANL-95/11 Rev 3.7 ] says on p. 44:
>Example usage of VecSetValues() may be found in
>${PETSC_DIR}/src/vec/vec/examples/tutorials/ex2.c
>
> However, that example [pp. 32-35 in the User Manual]
> does not contain "VecSetValues".
>


Re: [petsc-users] CG with right preconditioning supports NONE norm type only

2017-03-18 Thread Patrick Sanan
There is indeed a way to interpret preconditioned CG as left preconditioning:

Consider

Ax = b

where A is spd.

You can write down the equivalent left-preconditioned system

   M^{-1}Ax = M^{-1}b

but of course this doesn't work directly with CG because M^{-1}A is no
longer spd in general.

However, if M^{-1} is spd, so is M and you can use it to define an
inner product. In this inner product, M{-1}A *is* spd. Here assume
real scalars for simplicity:

  <M^{-1}Au,v>_M = <v,M^{-1}Au>_M = <v,Au> = <Av,u> = <u,Av> = <u,M^{-1}Av>_M

So, you can write use CG, using this inner product everywhere. You
don't in fact ever need to apply M, just M^{-1}, and you arrive at
standard preconditioned CG (See these notes [1]).

This is what's in PETSc as left-preconditioned CG, the only option.

This is a bit different from the way left preconditioning works for
(say) GMRES. With GMRES you could assemble A' = M^{-1}A and b' =
M^{-1}b and pass A' and b' to black-box (unpreconditioned) GMRES. With
CG, you fundamentally need to provide both M^{-1} and A (and
fortunately PETSc lets you do exactly this).

This then invites the natural question about applying the same logic
to the right preconditioned system. I saw this as an exercise in the
same course notes as above, from J. M. Melenk at TU Wien [1] .

Consider the right-preconditioned system

AM^{-1}y = b,x = M^{-1}y .

You can note that AM^-1 is spd in the M^{-1} inner product, and again
write down CG for the preconditioned system, now using M^{-1} inner
products everywhere.

You can then (as in this photo which I'm too lazy to transcribe [2]),
introduce transformed variables z = M^{-1}r, q = M^{-1}p, x = M^{-1}y
and work things out and arrive at ... standard preconditioned CG!

Thus the conclusion is that there really is only one way to
precondition CG. You can derive it as
1. left-preconditioned CG in the M norm, or
2. right-preconditioned CG in the M^{-1} norm, or
3. symmetrically-preconditioned CG (with M^{-1/2} which you don't need
to explicitly form), or
4. standard CG with a new inner product.

The last option is, I think, the most satisfying and fundamental. The
recent Málek and Strakoš book [3] makes a beautiful case as to why,
and indeed Hestenes himself first wrote about this in 1956, four years
after he and Stiefel popularized CG and before the term
"preconditioning" even existed [4].

So, in terms of concrete things to do with PETSc, I stand by my
feeling that calling CG preconditioning "left preconditioning" is
somehow misleading.

Rather, I'd suggest calling the relevant value of PCSide something
like PC_SPD . This
i) reminds the user of the fact that the preconditioner has to be spd,
ii) removes the asymmetry of calling it a left preconditioner when you
could equally-well call it a right preconditioner, and
iii) highlights the fact that preconditioning is different with CG
than with methods like GMRES.

PC_LEFT could be deprecated for CG before removing support.

If this seems reasonable, I can make a PR with these changes and
updates to the man page (and should also think about what other KSPs
this applies to - probably KSPMINRES and others).

[1] http://www.asc.tuwien.ac.at/~melenk/teach/iterative/
[2] http://patricksanan.com/rpcg.jpg
[3] http://bookstore.siam.org/sl01/
[4] 
https://scholar.google.ch/scholar?q=M.+Hestenes+1956.+The+Conjugate+Gradient+Method+for+Solving+Linear+Systems



On Wed, Mar 8, 2017 at 6:57 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>
>   Patrick,
>
> Thanks, this is interesting, we should try to add material to the KSPCG 
> page to capture some of the subtleties that I admit after 28 years I still 
> don't really understand. The paper A TAXONOMY FOR CONJUGATE GRADIENT METHODS 
> (attached) has some interesting discussion (particularly page 1548 
> "Therefore, only left preconditioning need be considered: Right 
> preconditioning may be effected by incorporating it into the left 
> preconditioner and inner product." I don't know exactly what this means in 
> practical terms in respect to code. (In PETSc KSP we don't explicitly talk 
> about, or use a separate "inner product"  in the Krylov methods, we only have 
> the concept of operator and preconditioner operator.)
>
>   Remembering vaguely, perhaps incorrectly, from a real long time ago "left 
> preconditioning with a spd M is just the unpreconditioned cg in the M inner 
> product" while "right preconditioning with M is unpreconditioned cg in a M^-1 
> inner product". If this is correct then it implies (I think) that right 
> preconditioned CG would produce a different set of iterates than left 
> preconditioning and hence is "missing" in terms of completeness from PETSc 
> KSP.
>
>   Barry
>
>
>
>
>> On Mar 8, 2017, at 7:37 PM, Patrick Sanan <patrick.s

Re: [petsc-users] KSPNormType natural

2017-03-13 Thread Patrick Sanan
This is something that arises with CG and related methods.

As in the previous thread, the "left preconditioner" in CG can be
thought of as an inner product B, which allows you to use CG with the
Krylov spaces K_k(BA;Bb) instead of K_k(A;b), while retaining the
property that the A-norm of the error is minimal over each successive
space.

The natural residual norm is the residual norm with respect to this
inner product, ||r||_B^2 = r^T B r, which you might end up coding up as
(b-Ax)^TB(b-Ax). Noting r = Ae, where e is the error, you could also
write this as e^TABAe, as in the notes in the KSPCG implementation.

In the standard preconditioned CG algorithm, you compute the natural
residual norm in the process, so you can monitor convergence in this
norm without computing any additional reductions/inner products/norms.



On Mon, Mar 13, 2017 at 9:16 PM, Kong, Fande  wrote:
> Hi All,
>
> What is the definition of KSPNormType natural? It is easy to understand
> none, preconditioned, and unpreconditioned, but not natural.
>
> Fande Kong,


Re: [petsc-users] CG with right preconditioning supports NONE norm type only

2017-03-08 Thread Patrick Sanan
On Wed, Mar 8, 2017 at 11:12 AM, Barry Smith  wrote:
>
>> On Mar 8, 2017, at 10:47 AM, Kong, Fande  wrote:
>>
>> Thanks Barry,
>>
>> We are using "KSPSetPCSide(ksp, pcside)" in the code.  I just tried 
>> "-ksp_pc_side right", and petsc did not error out.
>>
>> I like to understand why CG does not work with right preconditioning? 
>> Mathematically, the right preconditioning does not make sense?
>
>No, mathematically it makes sense to do it on the right. It is just that 
> the PETSc code was never written to support it on the right. One reason is 
> that CG is interesting that you can run with the true residual or the 
> preconditioned residual with left preconditioning, hence less incentive to 
> ever bother writing it to support right preconditioning. For completeness we 
> should support right as well as symmetric.

For standard CG preconditioning, which PETSc calls left
preconditioning, you use a s.p.d. preconditioner M to define an inner
product in the algorithm, and end up finding iterates x_k in K_k(MA;
Mb). That isn't quite the same as left-preconditioned GMRES, where you
apply standard GMRES to the equivalent system MAx=Mb, and also end up
finding iterates in K_k(MA,Mb). This wouldn't work for CG because MA
isn't s.p.d. in general, even if M and A are.

Standard CG preconditioning is often motivated as a clever way to do
symmetric preconditioning, E^TAEy = E^Tb, x=Ey, without ever needing E
explicitly, using only M=EE^T . y_k lives in K_k(E^TAE,E^Tb) and thus
x_k again lives in K_k(MA;Mb).

Thus, it's not clear that there is an candidate for a
right-preconditioned CG variant, as what PETSc calls "left"
preconditioning doesn't arise in the same way that it does for other
Krylov methods, namely using the standard algorithm on MAx=Mb. For
GMRES you would get a right-preconditioned variant by looking at the
transformed system AMy=b, x = My. This means that y_k lives in
K_k(AM,b), so x lives in K_k(MA,Mb), as before. For CG, AM wouldn't be
spd in general so this approach wouldn't make sense.

Another way to look at the difference in "left" preconditioning
between GMRES and CG is that

- introducing left preconditioning for GMRES alters both the Krylov
subspaces *and* the optimality condition: you go from minimizing || b
- Ax_k ||_2 over K_k(A;b) to minimizing || M (b-Ax_k) ||_2 over
K_k(MA;Mb).

- introducing "left" preconditioning for CG alters *only* the Krylov
subspaces: you always minimize || x - x_k ||_A , but change the space
from K_k(A;b) to K_k(MA;Mb).

Thus, my impression is that it's misleading to call standard CG
preconditioning "left" preconditioning in PETSc - someone might think
of GMRES and naturally ask why there is no right preconditioning.

One might define a new entry in PCSide to be used with CG and friends.
I can't think of any slam dunk suggestions yet, but something in the
genre of PC_INNERPRODUCT, PC_METRIC, PC_CG, or PC_IMPLICITSYMMETRIC,
perhaps.


>
>   Barry
>
>>
>> Fande,
>>
>> On Wed, Mar 8, 2017 at 9:33 AM, Barry Smith  wrote:
>>
>>   Please tell us how you got this output.
>>
>>   PETSc CG doesn't even implement right preconditioning. If you ask for it 
>> it should error out. CG supports no norm computation with left 
>> preconditioning.
>>
>>Barry
>>
>> > On Mar 8, 2017, at 10:26 AM, Kong, Fande  wrote:
>> >
>> > Hi All,
>> >
>> > The NONE norm type is supported only when CG is used with a right 
>> > preconditioner. Any reason for this?
>> >
>> >
>> >
>> > 0 Nonlinear |R| = 1.732051e+00
>> >   0 Linear |R| = 0.00e+00
>> >   1 Linear |R| = 0.00e+00
>> >   2 Linear |R| = 0.00e+00
>> >   3 Linear |R| = 0.00e+00
>> >   4 Linear |R| = 0.00e+00
>> >   5 Linear |R| = 0.00e+00
>> >   6 Linear |R| = 0.00e+00
>> >  1 Nonlinear |R| = 1.769225e-08
>> >   0 Linear |R| = 0.00e+00
>> >   1 Linear |R| = 0.00e+00
>> >   2 Linear |R| = 0.00e+00
>> >   3 Linear |R| = 0.00e+00
>> >   4 Linear |R| = 0.00e+00
>> >   5 Linear |R| = 0.00e+00
>> >   6 Linear |R| = 0.00e+00
>> >   7 Linear |R| = 0.00e+00
>> >   8 Linear |R| = 0.00e+00
>> >   9 Linear |R| = 0.00e+00
>> >  10 Linear |R| = 0.00e+00
>> >  2 Nonlinear |R| = 0.00e+00
>> > SNES Object: 1 MPI processes
>> >   type: newtonls
>> >   maximum iterations=50, maximum function evaluations=1
>> >   tolerances: relative=1e-08, absolute=1e-50, solution=1e-50
>> >   total number of linear solver iterations=18
>> >   total number of function evaluations=23
>> >   norm schedule ALWAYS
>> >   SNESLineSearch Object:   1 MPI processes
>> > type: bt
>> >   interpolation: cubic
>> >   alpha=1.00e-04
>> > maxstep=1.00e+08, minlambda=1.00e-12
>> > tolerances: relative=1.00e-08, absolute=1.00e-15, 
>> > lambda=1.00e-08
>> > maximum iterations=40
>> >   KSP Object:   1 MPI 

Re: [petsc-users] block ILU(K) is slower than the point-wise version?

2017-03-06 Thread Patrick Sanan
On Mon, Mar 6, 2017 at 1:48 PM, Kong, Fande  wrote:
> Hi All,
>
> I am solving a nonlinear system whose Jacobian matrix has a block structure.
> More precisely, there is a mesh, and for each vertex there are 11 variables
> associated with it. I am using BAIJ.
>
> I thought block ILU(k) should be more efficient than the point-wise ILU(k).
> After some numerical experiments, I found that the block ILU(K) is much
> slower than the point-wise version.
Do you mean that it takes more iterations to converge, or that the
time per iteration is greater, or both?
>
> Any thoughts?
>
> Fande,


Re: [petsc-users] configure PETSc on Cray

2017-02-10 Thread Patrick Sanan
On Fri, Feb 10, 2017 at 7:20 PM, Sharp Stone <thron...@gmail.com> wrote:
> Thanks for the replies.
>
> The PETSc is installed on the Cray I'm using, but the version is too old
> cray-petsc/3.3.04.
> I've still got two questions:
>
> 1. how can I use the installed cray-petsc? I've used module load cray-petsc,
> but I can't find the path of petsc, which is required in the makefile.
You can use

   module show cray-petsc

to at least get some more hints as to what cray did, PETSc-wise.
>
> 2. Therefore, I decided to configure the latest version of Petsc. When i use
> ./config/configure.py --with-cc=cc --with-cxx=CC --with-fc=ftn, it give me
> the errors
> ***
>  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for
> details):
> ---
> C libraries cannot directly be used from Fortran
>
> In the log file, I found these:
> ***
>  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for
> details):
> ---
> C libraries cannot directly be used from Fortran
> ***
>   File "./config/configure.py", line 405, in petsc_configure
> framework.configure(out = sys.stdout)
>   File
> "/mnt/lustre/lus0/space/fs1036/local/PETSc/petsc-3.7.4/config/BuildSystem/config/framework.py",
> line 1090, in configure
> self.processChildren()
>   File
> "/mnt/lustre/lus0/space/fs1036/local/PETSc/petsc-3.7.4/config/BuildSystem/config/framework.py",
> line 1079, in processChildren
> self.serialEvaluation(self.childGraph)
>   File
> "/mnt/lustre/lus0/space/fs1036/local/PETSc/petsc-3.7.4/config/BuildSystem/config/framework.py",
> line 1060, in serialEvaluation
> child.configure()
>   File
> "/mnt/lustre/lus0/space/fs1036/local/PETSc/petsc-3.7.4/config/BuildSystem/config/compilers.py",
> line 1438, in configure
> self.executeTest(self.checkCLibraries)
>   File
> "/mnt/lustre/lus0/space/fs1036/local/PETSc/petsc-3.7.4/config/BuildSystem/config/base.py",
> line 126, in executeTest
> ret = test(*args,**kargs)
>   File
> "/mnt/lustre/lus0/space/fs1036/local/PETSc/petsc-3.7.4/config/BuildSystem/config/compilers.py",
> line 313, in checkCLibraries
> raise RuntimeError('C libraries cannot directly be used from Fortran')
> 
> Finishing Configure Run at Fri Feb 10 13:07:10 2017
> 
>
>
> Thank you again for your help!
>
> On Fri, Feb 10, 2017 at 1:01 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>>
>> check config/examples/arch-cray-xt6-pkgs-opt.py
>>
>> Also check if your machine already has petsc [from cray] installed
>>
>> module avail petsc
>>
>> Satish
>>
>> On Fri, 10 Feb 2017, Patrick Sanan wrote:
>>
>> > You're missing the 'g' in 'configure.py'.
>> >
>> > On Fri, Feb 10, 2017 at 6:51 PM, Sharp Stone <thron...@gmail.com> wrote:
>> > > Hi all,
>> > >
>> > > I noticed the instructions of Petsc with MPI. Now I'm using Cray. So
>> > > the
>> > > instruction:
>> > >
>> > > Vendor provided MPI might already be installed. IBM, SGI, Cray etc
>> > > provide
>> > > their own:
>> > > ./config/confiure.py --with-cc=mpcc --with-fc=mpf77
>> > >
>> > > When I use the command "./config/confiure.py --with-cc=mpcc
>> > > --with-fc=mpf77", it gives me the error "-bash: ./config/confiure.py:
>> > > No
>> > > such file or directory". But the file "confiure.py" is there in the
>> > > "config"
>> > > folder.
>> > >
>> > > What should I do?
>> > >
>> > > Thanks very much!
>> > >
>> > > --
>> > > Best regards,
>> > >
>> > > Feng
>> >
>>
>
>
>
> --
> Best regards,
>
> Feng


Re: [petsc-users] meaning of constants in classical iterations

2017-01-20 Thread Patrick Sanan
On Fri, Jan 20, 2017 at 2:13 AM, Ed Bueler  wrote:
> Dear PETSc --
>
> In my humble opinion, the answers to the following rather basic questions
> are missing from the petsc users manual.  At least in part, I am not certain
> what the answers are.
>
>
> Question 1:   What formula does the preconditioned Richardson iteration
> (-ksp_type richardson -pc_type X) satisfy and where does the scale
> (-ksp_richardson_scale alpha) go?
>
> Answer?:  In the left-preconditioned form I would guess it is
>
> (*) x_{k+1} = x_k + alpha M_L^{-1} (b - A x_k),
>
> where M_L is the matrix implied by -pc_type X.  If so this makes option
> combination
>
> -ksp_type richardson -pc_type jacobi [-ksp_richardson_scale 1]
>
> into the classical Jacobi iteration
>
> x_{k+1} = x_k + D^{-1} (b - A x_k) = D^{-1} (b - (L+U)x_k)
>
> where A = D+L+U and D is the diagonal of A.
>
> I am pretty sure this answer is right, but equation (*) should probably be
> somewhere in the manual!
It is at least on the KSPRICHARDSON man page (which I have added to my
list of things to eventually clean up), albeit without explicit
mention of the left preconditioning.
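
For concreteness, (*) written out by hand with standard Vec/Mat/PC calls
looks like the sketch below. This is not the actual KSPRICHARDSON source,
just an illustration; A, pc, b, x and alpha are assumed to exist already.

#include <petscksp.h>
/* One step of x_{k+1} = x_k + alpha M_L^{-1} (b - A x_k), spelled out.
   A sketch to make the formula concrete; error checking omitted. */
static PetscErrorCode RichardsonStep(Mat A, PC pc, Vec b, Vec x, PetscReal alpha)
{
  Vec r, z;
  VecDuplicate(x, &r);
  VecDuplicate(x, &z);
  MatMult(A, x, r);        /* r = A x_k               */
  VecAYPX(r, -1.0, b);     /* r = b - A x_k           */
  PCApply(pc, r, z);       /* z = M_L^{-1} r          */
  VecAXPY(x, alpha, z);    /* x_{k+1} = x_k + alpha z */
  VecDestroy(&r);
  VecDestroy(&z);
  return 0;
}
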
>
>
> Question 2.  What formula is the (non-symmetric) SOR satisfying, and where
> do the constants -pc_sor_omega and -pc_sor_diagonal_shift delta go?
>
> Answer?:  My understanding is first that the classical Jacobi, Gauss-Seidel,
> and SOR iterations are all regarded by PETSc developers as
> left-preconditioned Richardson iterations, so names "jacobi" and "sor"
> describe PC objects and these classical iterations are all KSP type
> "richardson".  That is, they are all instances of (*).
>
> (RANT:  There is no clear statement of this excellent philosophy in the
> users manual, and it would help a lot of students of PETSc!  While some
> references share it, reading standard literature, including relevant wiki
> pages, is not healthful on this topic.  Because the literature does not
> necessarily parallel PETSc thinking, it requires constant mental
> translation.  This is bad when it comes to setting parameters at runtime!)
This would be very good for one of the proposed introductory
tutorials, about KSP/PC, helping people translate algorithms into
PETSc options. I've added some notes on this to my todo list as well.
>
> Supposing the above philosophy, and that A = D+L+U with the usual meanings,
> then I guess SOR is
>
> (**)x_{k+1} = x_k + alpha (omega^{-1} (D+delta) + L)^{-1} (b - A x_k),
>
> In case alpha=1, omega=1, delta=0 then (**) becomes Gauss-Seidel:
>
> -ksp_type richardson -pc_type sor [-ksp_richardson_scale 1 -pc_sor_omega 1
> -pc_sor_diagonal_shift 0]
>
> Especially because it involves three user-controlled constants, equation
> (**) could usefully be stated somewhere in the manual ... supposing it is
> correct.

Would you have any objection to this sort of thing being on the man
pages (and/or in tutorials), instead of in the manual? Especially with
the current state of affairs, with the  man pages google-able and the
manual not, the man pages might be a better place for these specifics
- hopefully it's clear to readers of the manual that they can click
links to be taken to man pages (if they have web access), which can
allow for finer details without interrupting the flow of the manual
too much or making it even longer.
>
>
> Question 3.  What is the probability that a PETSc developer will address the
> above two questions by telling me that classical iterations are lame?
>
> Answer:  Not the point!  These iterations are used blockwise (e.g. ASM) and
> as smoothers in many proper PETSc applications.  Clarity is helpful!
>
>
> Thanks for the great tool, as usual!
>
> Ed
>
>
> --
> Ed Bueler
> Dept of Math and Stat and Geophysical Institute
> University of Alaska Fairbanks
> Fairbanks, AK 99775-6660
> 301C Chapman and 410D Elvey
> 907 474-7693 and 907 474-7199  (fax 907 474-5394)


Re: [petsc-users] Simultaneously compute Residual+Jacobian in SNES

2016-12-12 Thread Patrick Sanan
On Mon, Dec 12, 2016 at 6:33 PM, Jed Brown <j...@jedbrown.org> wrote:
> Patrick Sanan <patrick.sa...@gmail.com> writes:
>> Maybe a good way to proceed is to have the "both at once computation"
>> explicitly available in PETSc's Newton or other solvers that "know"
>> that they can gain efficiency by computing both at once. That is, the
>> user is always required to register a callback to form the residual,
>> and may optionally register a function to compute (only) the Jacobian
>> and/or another function to compute the combined Jacobian/Residual.
>> Newton can look for the combined function at the times when it would
>> help, and if nothing was registered would fall back to the current
>> technique of calling separate residual and Jacobian routines. This
>> would preserve the benefits of the current approach, allow
>> optimizations of the kind mentioned, and hopefully remain
>> somewhat-maintainable; in particular, this doesn't require the user to
>> be promised anything about the order in which Jacobians and residuals
>> are computed.
>
> How would the solver decide?  It requires a model for the relative costs
> of {residual, Jacobian, residual+Jacobian}.  Where would the SNES learn
> that?
>
> I'm not saying the above is wrong, but it adds complexity and wouldn't
> be useful without the cost model.  An alternative would be for the
> user's residual to be able to query the SNES's estimated probability
> that a Jacobian will be needed.  So if the SNES has observed quadratic
> convergence, it might be quite confident (unless it expects this
> residual to meet the convergence criteria).  The advantage of this
> scheme is that it applies to partial representations of the Jacobian
> (like storing the linearization at quadrature points, versus assembling
> the sparse matrix).

I did have an error in my thinking, which is that I forgot about the
fact that you typically have the residual already at the point where
the Jacobian is computed, so one would have to be using a method
(like the cartoon in my mind when I wrote the above) that didn't
involve a linesearch (as Derek also described) or as you suggest,
somehow had a guess about when the linesearch would terminate.
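
For what it's worth, a user can already share the expensive per-element work
between the two callbacks by caching it in the application context; a rough
sketch below (the AppCtx layout and the DoExpensiveElementWork() name are
made up for illustration, and cached_state is assumed to have been duplicated
from the solution vector at setup time):

#include <petscsnes.h>

typedef struct {
  Vec       cached_state;   /* copy of the X the cache was built for */
  PetscBool cache_valid;
  /* ... expensive per-quadrature-point data would live here ... */
} AppCtx;

static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ptr)
{
  AppCtx   *user = (AppCtx*)ptr;
  PetscBool same = PETSC_FALSE;
  if (user->cache_valid) VecEqual(X, user->cached_state, &same);
  if (!same) {
    /* DoExpensiveElementWork(X, user);   hypothetical shared setup */
    VecCopy(X, user->cached_state);
    user->cache_valid = PETSC_TRUE;
  }
  /* ... assemble F from the cached data ... */
  return 0;
}

static PetscErrorCode FormJacobian(SNES snes, Vec X, Mat J, Mat P, void *ptr)
{
  /* refresh the cache exactly as in FormFunction, then assemble J/P from it */
  return 0;
}

/* Registered as usual:
   SNESSetFunction(snes, r, FormFunction, &user);
   SNESSetJacobian(snes, J, J, FormJacobian, &user);  */

This keeps the decision entirely on the user side, so the solver itself needs
no cost model and makes no promises about call order.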


Re: [petsc-users] Simultaneously compute Residual+Jacobian in SNES

2016-12-12 Thread Patrick Sanan
On Mon, Dec 12, 2016 at 5:58 AM, Derek Gaston  wrote:
> A quick note: I'm not hugely invested in this idea... I'm just talking it
> out since I started it.  The issues might outweigh potential gains...
>
> On Sun, Dec 11, 2016 at 4:02 PM Matthew Knepley  wrote:
>>
>> I consider this bad code management more than an analytical case for the
>> technique, but I can see the point.
>
>
> Can you expand on that?  Do you believe automatic differentiation in general
> to be "bad code management"?
>
>>
>>- Extremely complex, heavy shape function evaluation (think super high
>> order with first and second derivatives needing to be computed)
>>
>> I honestly do not understand this one. Maybe I do not understand high
>> order since I never use it. If I want to compute an integral, I have
>> the basis functions tabulated. I understand that for high order, you use a
>> tensor product evaluation, but you still tabulate in 1D. What is
>> being recomputed here?
>
>
> In unstructured mesh you still have to compute the reference->physical map
> for each element and map all of the gradients/second derivatives to physical
> space.  This can be quite expensive if you have a lot of shape functions and
> a lot of quadrature points.
>
> Sometimes we even have to do this step twice: once for the un-deformed mesh
> and once for the deformed mesh... on every element.
>
>>
>>
>>>
>>>   - Extremely heavy material property computations
>>
>>
>> Yes. I have to think about this one more.
>>
>>>
>>>   - MANY coupled variables (we've run thousands).
>>
>>
>> Ah, so you are saying that the process of field evaluation at the
>> quadrature points is expensive because you have so many fields.
>> It feels very similar to the material case, but I cannot articulate why.
>
>
> It is similar: it's all about how much information you have to recompute at
> each quadrature point.  I was simply giving different scenarios for why you
> could end up with heavy calculations at each quadrature point that feed into
> both the Residual and Jacobian calculations.
>
>>
>> I guess my gut says that really expensive material properties,
>> much more expensive than my top level model, should be modeled by
>> something simpler at that level. Same feeling for using
>> thousands of fields.
>
>
> Even if you can find something simpler it's good to be able to solve the
> expensive one to verify your simpler model.  Sometimes the microstructure
> behavior is complicated enough that it's quite difficult to wrap up in a
> simpler model or (like you said) it won't be clear if a simpler model is
> possible without doing the more expensive model first.
>
> We really do have models that require thousands (sometimes tens of
> thousands) of coupled PDEs.  Reusing the field evaluations for both the
> residual and Jacobian could be a large win.
>
>>
>>   1) I compute a Jacobian with every residual. This sucks because line
>> search and lots of other things use residuals.
>>
>>
>>   2) I compute a residual with every Jacobian. This sound like it could
>> work because I compute both for the Newton system, but here I
>>   am reusing the residual I computed to check the convergence
>> criterion.
>>
>> Can you see a nice way to express Newton for this?
>
>
> You can see my (admittedly stupidly simple) Newton code that works this way
> here:
> https://github.com/friedmud/MOOSE.jl/blob/master/src/solvers/JuliaDenseNonlinearImplicitSolver.jl#L42
>
> Check the assembly code here to see how both are computed simultaneously:
> https://github.com/friedmud/MOOSE.jl/blob/master/src/solvers/Assembly.jl#L59
>
> Lack of line search makes it pretty simple.  However, even with this simple
> code I end up wasting one extra Jacobian evaluation once the convergence
> criteria has been reached.  Whether or not that is advantageous depends on
> the relative tradeoffs of reusable element computations vs Jacobian
> calculation and how many nonlinear iterations you do (if you're only doing
> one nonlinear iteration every timestep then you're wasting 50% of your total
> Jacobian calculation time).
>
> For a full featured solver you would definitely also want to have the
> ability to compute a residual, by itself, when you want... for things like
> line search.
>
> You guys have definitely thought a lot more about this than I have... I'm
> just spitballing here... but it does seem like having an optional interface
> for computing a combined residual/Jacobian could save some codes a
> significant amount of time.
Maybe a good way to proceed is to have the "both at once computation"
explicitly available in PETSc's Newton or other solvers that "know"
that they can gain efficiency by computing both at once. That is, the
user is always required to register a callback to form the residual,
and may optionally register a function to compute (only) the Jacobian
and/or another function to compute the combined Jacobian/Residual.
Newton can look for the combined function at 

Re: [petsc-users] Looking for a quick example of a symmetric KKT system

2016-10-22 Thread Patrick Sanan
No particularly good reason - I am also using Stokes systems as tests,
but thought it might be interesting to test on a saddle point system 
arising from a different problem.
On Fri, Oct 21, 2016 at 11:50:24AM -0600, Jed Brown wrote:
> Why doesn't a Stokes problem fulfill your needs?
> 
> Patrick Sanan <patrick.sa...@gmail.com> writes:
> 
> > Yes, but AFAIK that example produces a 2x2 system - I was hoping for
> > something with a variable problem size, ideally with some sort of
> > physics motivating the underlying optimization problem.
> >
> > On Fri, Oct 21, 2016 at 7:23 PM, Justin Chang <jychan...@gmail.com> wrote:
> >> Something like this?
> >>
> >> http://www.mcs.anl.gov/petsc/petsc-current/src/tao/constrained/examples/tutorials/toy.c.html
> >>
> >>
> >> On Friday, October 21, 2016, Patrick Sanan <patrick.sa...@gmail.com> wrote:
> >>>
> >>> Are there any examples already in PETSc or TAO that assemble such a
> >>> system (which could thus be dumped)? SNES example ex73f90t assembles a
> >>> non-symmetric KKT system.






Re: [petsc-users] Looking for a quick example of a symmetric KKT system

2016-10-21 Thread Patrick Sanan
Yes, but AFAIK that example produces a 2x2 system - I was hoping for
something with a variable problem size, ideally with some sort of
physics motivating the underlying optimization problem.

On Fri, Oct 21, 2016 at 7:23 PM, Justin Chang <jychan...@gmail.com> wrote:
> Something like this?
>
> http://www.mcs.anl.gov/petsc/petsc-current/src/tao/constrained/examples/tutorials/toy.c.html
>
>
> On Friday, October 21, 2016, Patrick Sanan <patrick.sa...@gmail.com> wrote:
>>
>> Are there any examples already in PETSc or TAO that assemble such a
>> system (which could thus be dumped)? SNES example ex73f90t assembles a
>> non-symmetric KKT system.


[petsc-users] Looking for a quick example of a symmetric KKT system

2016-10-21 Thread Patrick Sanan
Are there any examples already in PETSc or TAO that assemble such a
system (which could thus be dumped)? SNES example ex73f90t assembles a
non-symmetric KKT system.


Re: [petsc-users] large PetscCommDuplicate overhead

2016-10-06 Thread Patrick Sanan
Happy to (though Piz Daint goes down for an extended upgrade on Oct 17
so would need to be run before then)!

On Thu, Oct 6, 2016 at 6:03 PM, Matthew Knepley <knep...@gmail.com> wrote:
> On Thu, Oct 6, 2016 at 10:55 AM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>>
>>
>>Matt,
>>
>> Thanks for this information. It sure looks like there is something
>> seriously wrong with the MPI_Attr_get() on the cray for non-root process.
>> Does any PETSc developer have access to such a machine? We need to write a
>> test program that just calls MPI_Attr_get a bunch of times (no PETSc) to see
>> if we can reproduce the problem and report it to Cray.
>
>
> Barry, if you write it, we can give it to Patrick Sanan to run.
>
>   Thanks,
>
> Matt
>
>>
>>Barry
>>
>>
>>
>> On Oct 6, 2016, at 10:45 AM, Matthew Overholt <overh...@capesim.com>
>> wrote:
>> >
>> >
>> > Matthew and Barry,
>> >
>> > 1) I did a direct measurement of PetscCommDuplicate() time by tracing
>> > just
>> > that call (using CrayPat), and confirmed the sampling results.  For 8
>> > processes (n=8), tracing counted a total of 101 calls, taking ~0 time on
>> > the
>> > root process but taking 11.78 seconds (6.3% of 188 total seconds) on
>> > each of
>> > the other 7 processes.  For 16 processes (n=16, still only 1 node),
>> > tracing
>> > counted 102 total calls for a total of 18.42 seconds (13.2% of 139.6
>> > total
>> > seconds) on every process except the root.
>> >
>> > 2) Copied below is a section of the log view for the first two solutions
>> > for
>> > n=2, which shows the same calls as for n=8. (I can send the entire log
>> > files
>> > if desired.)  In each case I count about 44 PCD calls per process during
>> > initialization and meshing, 7 calls during setup, 9 calls for the first
>> > solution, then 3 calls for each subsequent solution (fixed-point
>> > iteration),
>> > and 3 calls to write out the solution, for 75 total.
>> >
>> > 3) I would expect that the administrators of this machine have
>> > configured
>> > PETSc appropriately. I am using their current default install, which is
>> > 3.7.2.
>> >
>> > https://www.nersc.gov/users/software/programming-libraries/math-libraries/pe
>> > tsc/
>> >
>> > 4) Yes, I just gave the MUMPS time as a comparison.
>> >
>> > 5) As to where it is spending time, perhaps the timing results in the
>> > log
>> > files will be helpful. The "Solution took ..." printouts give the total
>> > solution time for that iteration, the others are incremental times.  (As
>> > an
>> > aside, I have been wondering why the solution times do not scale well
>> > with
>> > process count, even though that work is entirely done in parallel PETSc
>> > routines.)
>> >
>> > Thanks,
>> > Matt Overholt
>> >
>> >
>> > ** -log_view -info results for n=2 : the first solution and
>> > subsequent fixed-point iteration ***
>> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688
>> > -2080374779
>> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688
>> > -2080374779
>> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
>> > -2080374780
>> > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688
>> > -2080374781
>> > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
>> > -2080374782
>> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
>> > -2080374780
>> > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689
>> > -2080374782
>> > Matrix setup took 0.108 s
>> > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688
>> > -2080374779
>> > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688
>> > -2080374781
>> > KSP PC setup took 0.079 s
>> > [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs.
>> > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs.
>> > [0] MatStashScatterBegin_Ref(): No of messages: 0
>> > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
>> > [1] MatStashScatterBegin_Ref(): No of messages: 1
>> > [1] MatStashScatterBegin_Ref(): Me

Re: [petsc-users] How to broadcast a double value to all the nodes in the cluster with Petsc

2016-10-05 Thread Patrick Sanan
PETSc, by design, does not wrap any of the existing functionality of
MPI, so this would be accomplished with an MPI function like
MPI_Bcast().
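
For example, a minimal sketch of broadcasting a single double from rank 0
over the communicator PETSc was initialized on:

#include <petscsys.h>
int main(int argc, char **argv)
{
  double      val = 0.0;
  PetscMPIInt rank;
  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
  if (!rank) val = 3.14;                               /* value known only on rank 0 */
  MPI_Bcast(&val, 1, MPI_DOUBLE, 0, PETSC_COMM_WORLD); /* now every rank has it */
  PetscFinalize();
  return 0;
}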

On Wed, Oct 5, 2016 at 11:02 AM, 丁老师  wrote:
> Dear professor:
>How to broadcast a double value to all the nodes in the cluster with
> Petsc
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: [petsc-users] Why the code needs much longer time when I use MPI

2016-09-29 Thread Patrick Sanan
Is the number of iterations to convergence also increasing with the
number of processors?

A couple of possibly relevant FAQs:
http://www.mcs.anl.gov/petsc/documentation/faq.html#differentiterations
http://www.mcs.anl.gov/petsc/documentation/faq.html#slowerparallel

On Thu, Sep 29, 2016 at 10:56 AM, Ji Zhang  wrote:
> Dear all,
> I'm using KSP included in PETSc to solve some linear equations. The code
> runs well if I do not use mpirun. However, the running duration
> increases linearly with the number of CPUs.
>
> The solve method is gmres without a preconditioner. For the same case,
> it needs about 4.9s with 'mpirun -n 1', which is about 7.9s if I use 'mpirun
> -n 2', and 12.5s when I use 'mpirun -n 4'. It looks like I need double the
> time when I use double the CPUs. Is there anyone who could give me some suggestion?
> Thanks.
>
> Best,
> Regards,
> Zhang Ji, PhD student
> Beijing Computational Science Research Center
> Zhongguancun Software Park II, No. 10 Dongbeiwang West Road, Haidian
> District, Beijing 100193, China


Re: [petsc-users] Diagnosing a difference between "unpreconditioned" and "true" residual norms

2016-09-13 Thread Patrick Sanan
Hi Barry -

You were correct - the problem was indeed a nonlinear (in double
precision) operator. My operator in this case wrapped an existing
residual evaluation routine (by someone else). I had lazily been
computing Ax = (Ax-b)+b; that is, compute the residual and then add
the rhs back. The norm of the b was large enough to wipe out a few
digits of accuracy, thus introducing floating point noise which
rendered the operator nonlinear enough in double precision to induce
the observed deviation in the norms.

Thanks!
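
In case it helps anyone searching the archives later, the digit loss is easy
to reproduce in isolation; a tiny standalone demo (the numbers are arbitrary,
chosen only so that ||b|| is much larger than ||Ax - b||):

#include <stdio.h>
int main(void)
{
  double Ax = 1.23456789012345e-2;  /* stands in for one entry of A x              */
  double b  = 9.87654321e+6;        /* stands in for the same entry of b           */
  double r  = Ax - b;               /* -(b - A x), i.e. minus the residual entry   */
  double Ax_recovered = r + b;      /* what the shell MatMult effectively computed */
  printf("direct   : %.15e\n", Ax);
  printf("recovered: %.15e\n", Ax_recovered); /* agrees to only ~7-8 digits */
  return 0;
}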

On Fri, Sep 9, 2016 at 9:44 PM, Barry Smith <bsm...@mcs.anl.gov> wrote:
>
>Patrick,
>
>  I have only seen this when the "linear" operator turned out to not 
> actually be linear or at least not linear in double precision.
> Are you using differencing or anything in your MatShell that might make it 
> not be a linear operator in full precision?
>
> Since your problem is so small you can compute the Jacobian explicitly via 
> finite differencing and then use that matrix plus your shell preconditioner. 
> I bet if you do this you will see the true and non-true residual norms 
> remain the same, this would likely mean something is wonky with your shell 
> matrix.
>
>   Barry
>
>> On Sep 9, 2016, at 9:32 AM, Patrick Sanan <patrick.sa...@gmail.com> wrote:
>>
>> I am debugging a linear solver which uses a custom operator and
>> preconditioner, via MATSHELL and PCSHELL. Convergence seems to be
>> fine, except that I unexpectedly see a difference between the
>> "unpreconditioned" and "true" residual norms when I use
>> -ksp_monitor_true_residual with a right-preconditioned Krylov method
>> (FGMRES or right-preconditioned GMRES).
>>
>>  0 KSP unpreconditioned resid norm 9.266794204683e+08 true resid norm
>> 9.266794204683e+08 ||r(i)||/||b|| 1.e+00
>>  1 KSP unpreconditioned resid norm 2.317801431974e+04 true resid norm
>> 2.317826550333e+04 ||r(i)||/||b|| 2.501217248530e-05
>>  2 KSP unpreconditioned resid norm 4.453270507534e+00 true resid norm
>> 2.699824780158e+01 ||r(i)||/||b|| 2.913439880638e-08
>>  3 KSP unpreconditioned resid norm 1.015490793887e-03 true resid norm
>> 2.658635801018e+01 ||r(i)||/||b|| 2.868991953738e-08
>>  4 KSP unpreconditioned resid norm 4.710220776105e-07 true resid norm
>> 2.658631616810e+01 ||r(i)||/||b|| 2.868987438467e-08
>> KSP Object:(mgk_) 1 MPI processes
>>  type: fgmres
>>GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
>> Orthogonalization with no iterative refinement
>>GMRES: happy breakdown tolerance 1e-30
>>  maximum iterations=1, initial guess is zero
>>  tolerances:  relative=1e-13, absolute=1e-50, divergence=1.
>>  right preconditioning
>>  using UNPRECONDITIONED norm type for convergence test
>> PC Object:(mgk_) 1 MPI processes
>>  type: shell
>>Shell: Custom PC
>>  linear system matrix = precond matrix:
>>  Mat Object:  Custom Operator   1 MPI processes
>>type: shell
>>rows=256, cols=256
>>  has attached null space
>>
>> I have dumped the explicit operator and preconditioned operator, and I
>> can see that the operator and the preconditioned operator each have a
>> 1-dimensional nullspace (a constant-pressure nullspace) which I have
>> accounted for by constructing a normalized, constant-pressure vector
>> and supplying it to the operator via a MatNullSpace.
>>
>> If I disregard the (numerically) zero singular value, the operator has
>> a condition number of 1.5669e+05 and the preconditioned operator has a
>> condition number of 1.01 (strong preconditioner).
>>
>> Has anyone seen this sort of behavior before and if so, is there a
>> common culprit that I am overlooking? Any ideas of what to test next
>> to try to isolate the issue?
>>
>> As I understand it, the unpreconditioned and true residual norms
>> should be identical in exact arithmetic, so I would suspect that
>> somehow I've ended up with a "bad Hessenberg matrix" in some way as I
>> perform this solve (or maybe I have a more subtle bug).
>


[petsc-users] Diagnosing a difference between "unpreconditioned" and "true" residual norms

2016-09-09 Thread Patrick Sanan
I am debugging a linear solver which uses a custom operator and
preconditioner, via MATSHELL and PCSHELL. Convergence seems to be
fine, except that I unexpectedly see a difference between the
"unpreconditioned" and "true" residual norms when I use
-ksp_monitor_true_residual with a right-preconditioned Krylov method
(FGMRES or right-preconditioned GMRES).

  0 KSP unpreconditioned resid norm 9.266794204683e+08 true resid norm
9.266794204683e+08 ||r(i)||/||b|| 1.e+00
  1 KSP unpreconditioned resid norm 2.317801431974e+04 true resid norm
2.317826550333e+04 ||r(i)||/||b|| 2.501217248530e-05
  2 KSP unpreconditioned resid norm 4.453270507534e+00 true resid norm
2.699824780158e+01 ||r(i)||/||b|| 2.913439880638e-08
  3 KSP unpreconditioned resid norm 1.015490793887e-03 true resid norm
2.658635801018e+01 ||r(i)||/||b|| 2.868991953738e-08
  4 KSP unpreconditioned resid norm 4.710220776105e-07 true resid norm
2.658631616810e+01 ||r(i)||/||b|| 2.868987438467e-08
KSP Object:(mgk_) 1 MPI processes
  type: fgmres
GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
Orthogonalization with no iterative refinement
GMRES: happy breakdown tolerance 1e-30
  maximum iterations=1, initial guess is zero
  tolerances:  relative=1e-13, absolute=1e-50, divergence=1.
  right preconditioning
  using UNPRECONDITIONED norm type for convergence test
PC Object:(mgk_) 1 MPI processes
  type: shell
Shell: Custom PC
  linear system matrix = precond matrix:
  Mat Object:  Custom Operator   1 MPI processes
type: shell
rows=256, cols=256
  has attached null space

I have dumped the explicit operator and preconditioned operator, and I
can see that the operator and the preconditioned operator each have a
1-dimensional nullspace (a constant-pressure nullspace) which I have
accounted for by constructing a normalized, constant-pressure vector
and supplying it to the operator via a MatNullSpace.

If I disregard the (numerically) zero singular value, the operator has
a condition number of 1.5669e+05 and the preconditioned operator has a
condition number of 1.01 (strong preconditioner).

Has anyone seen this sort of behavior before and if so, is there a
common culprit that I am overlooking? Any ideas of what to test next
to try to isolate the issue?

As I understand it, the unpreconditioned and true residual norms
should be identical in exact arithmetic, so I would suspect that
somehow I've ended up with a "bad Hessenberg matrix" in some way as I
perform this solve (or maybe I have a more subtle bug).


Re: [petsc-users] How to use rtol as stopping criteria for PCMG subksp solver

2016-09-01 Thread Patrick Sanan
On Thu, Sep 1, 2016 at 4:02 PM, Aulisa, Eugenio  wrote:
> Hi,
>
> I have
>
> ksp GMRES->preconditioned with PCMG
>
> and at each level
> subksp GMRES -> preconditioned with different PCs
>
> I would like to control the stopping criteria of each level-subksp
> using either the relative tolerance or the npre/npost number of smoothings
>
> I set at each level
>
>  KSPSetNormType(subksp, KSP_NORM_PRECONDITIONED); // assuming left 
> preconditioner
>  KSPSetTolerances(subksp, rtol, PETSC_DEFAULT, PETSC_DEFAULT, npre);
>
> It seems (but I am not sure how to check it properly) that rtol is completely 
> uninfluential,
> rather it always uses the fixed number of iterations given by npre.
>
> When I run with the option -ksp_view I see that the norm for the subksp
> has effectively changed  from NONE to  PRECONDITIONED
> for example with rtol=0.1 and npre=10 I get
>
> %%%
>
> Down solver (pre-smoother) on level 2 ---
> KSP Object:(level-2) 4 MPI processes
>   type: gmres
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
> Orthogonalization with no iterative refinement
> GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=10
>   tolerances:  relative=0.1, absolute=1e-50, divergence=1.
>   left preconditioning
>   using nonzero initial guess
>   using PRECONDITIONED norm type for convergence test
>
> %
>
> however when I run the code with -ksp_monitor_true_residual
> I only see the iteration info of the external ksp GMRES,
> but I do not see any iteration info relative to  the subksps.
Note that each subksp has its own prefix which you can use to control
its behavior. It looks like you set this to "level-2", so (assuming
that you can have dashes in options prefixes, which I'm not completely
certain of), you should be able to do things like

-level-2_ksp_converged_reason
-level-2_ksp_monitor_true_residual

To see what's going on with the sub solver.
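
You can also grab the level KSP in code and query it directly; a rough
sketch, assuming ksp is your outer GMRES already set up with the PCMG
preconditioner and b, x are your right-hand side and solution:

PC       pc;
KSP      smoother;
PetscInt its;
KSPGetPC(ksp, &pc);
PCMGGetSmootherDown(pc, 2, &smoother);   /* the level-2 pre-smoother shown by -ksp_view */
KSPSetTolerances(smoother, 0.1, PETSC_DEFAULT, PETSC_DEFAULT, 10);
/* ... KSPSolve(ksp, b, x) ... */
KSPGetIterationNumber(smoother, &its);   /* iterations taken in the most recent sweep */
PetscPrintf(PETSC_COMM_WORLD, "level-2 pre-smoother used %D iterations\n", its);
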
>
> I assume (but I am not sure) that subksp is behaving
> as if the NONE norm were set.
>
> Any idea if I set something wrong?
> How do I effectively check how many iterations subksp does?
>
> thanks,
> Eugenio
>
>
>
>
>
>
>
>
>
> Eugenio Aulisa
>
> Department of Mathematics and Statistics,
> Texas Tech University
> Lubbock TX, 79409-1042
> room: 226
> http://www.math.ttu.edu/~eaulisa/
> phone: (806) 834-6684
> fax: (806) 742-1112
>
>
>


Re: [petsc-users] What caused this problem."PetscInitialize() must be called before PetscFinalize()"

2016-08-26 Thread Patrick Sanan
Could you send the entire error message?

On Fri, Aug 26, 2016 at 9:42 AM, 丁老师  wrote:
> Dear friends:
>My code ran without any problem yesterday, but now it gives me the following
> error. I do not know what the reason is.
> Regards
>
> [0]PETSC ERROR: #1 PetscOptionsInsertFile() line 563 in
> /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c
> [0]PETSC ERROR: #2 PetscOptionsInsert() line 720 in
> /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/options.c
> [0]PETSC ERROR: #3 PetscInitialize() line 828 in
> /home/ztdep/Downloads/petsc-3.6.3/src/sys/objects/pinit.c
>
> PetscInitialize() must be called before PetscFinalize()
>
> Regards
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>
>


Re: [petsc-users] false-positive leak report in log_view?

2016-08-04 Thread Patrick Sanan
I have a patch that I got from Dave that he got from Jed which seems
to be related to this. I'll make a PR.

On Wed, Aug 3, 2016 at 8:18 PM, Mohammad Mirzadeh  wrote:
> OK so I just ran the example under valgrind, and if I use two VecViews, it
> complains about following leak:
>
> ==66838== 24,802 (544 direct, 24,258 indirect) bytes in 1 blocks are
> definitely lost in loss record 924 of 926
> ==66838==at 0x19EBB: malloc (in
> /usr/local/Cellar/valgrind/3.11.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
> ==66838==by 0x10005E638: PetscMallocAlign (in
> /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib)
> ==66838==by 0x100405F00: DMCreate_DA (in
> /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib)
> ==66838==by 0x1003CFFA4: DMSetType (in
> /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib)
> ==66838==by 0x100405B7F: DMDACreate (in
> /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib)
> ==66838==by 0x1003F825F: DMDACreate2d (in
> /usr/local/Cellar/petsc/3.7.2/real/lib/libpetsc.3.7.2.dylib)
> ==66838==by 0x11D89: main (main_test.cpp:7)
>
> By I am destroying the dm ... also I dont get this when using a single
> VecView. As a bonus info, PETSC_VIEWER_STDOUT_WORLD is just fine, so this
> looks like it is definitely vtk related.
>
> On Wed, Aug 3, 2016 at 2:05 PM, Mohammad Mirzadeh 
> wrote:
>>
>> On Wed, Aug 3, 2016 at 10:59 AM, Matthew Knepley 
>> wrote:
>>>
>>> On Tue, Aug 2, 2016 at 12:40 PM, Mohammad Mirzadeh 
>>> wrote:

 I often use the memory usage information in log_view as a way to check
 on memory leaks and so far it has worked perfectly. However, I had long
 noticed a false-positive memory-leak report for Viewers, i.e. the
 destruction count is always one less than the creation count.
>>>
>>>
>>> Yes, I believe that is the Viewer being used to print this information.
>>
>>
>> That makes sense.
>>>
>>>

 Today, I noticed what seems to be a second one. If you use VecView to
 write the same DA to vtk, i.e. call VecView(A, vtk); twice, it also reports a
 memory leak for vectors, vecscatters, dm, etc. I am calling this a
 false-positive since the code is valgrind-clean.

 Is this known/expected?
>>>
>>>
>>> The VTK viewers have to hold everything they output until they are
>>> destroyed since the format does not allow immediate writing.
>>> I think the VTK viewer is not destroyed at the time of this output. Can
>>> you make a small example that does this?
>>
>>
>> Here's a small example that illustrates the issues
>>
>> #include <petsc.h>
>>
>> int main(int argc, char *argv[]) {
>>
>>   PetscInitialize(&argc, &argv, NULL, NULL);
>>
>>   DM dm;
>>   DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DMDA_STENCIL_BOX,
>>                -10, -10, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, &dm);
>> //  DMDASetUniformCoordinates(dm, -1, 1, -1, 1, 0, 0);
>>
>>   Vec sol;
>>   DMCreateGlobalVector(dm, &sol);
>>   VecSet(sol, 0);
>>
>>   PetscViewer vtk;
>>   PetscViewerVTKOpen(PETSC_COMM_WORLD, "test.vts", FILE_MODE_WRITE, &vtk);
>>   VecView(sol, vtk);
>> //  VecView(sol, vtk);
>>   PetscViewerDestroy(&vtk);
>>
>>   DMDestroy(&dm);
>>   VecDestroy(&sol);
>>
>>   PetscFinalize();
>>   return 0;
>> }
>>
>>
>> If you uncomment the second VecView you get reports for leaks in
>> VecScatter and dm. If you also uncomment the DMDASetUniformCoordinates, and
>> use both VecViews, you also get a leak report for Vecs ... it's quite bizarre
>> ...
>>
>>>
>>> I have switched to HDF5 and XDMF due to the limitations of VTK format.
>>>
>>
>> I had used XDMF + raw binary in the past and was satisfied with the
>> result. Do you write a single XDMF as a "post-processing" step when the
>> simulation is finished? If I remember correctly preview could not open xmf
>> files as time-series.
>>>
>>>   Thanks,
>>>
>>>  Matt
>>>

 Here's the relevant bit from log_view:

 --- Event Stage 0: Main Stage

              Vector     8              7       250992     0.
      Vector Scatter     2              0            0     0.
    Distributed Mesh     2              0            0     0.
Star Forest Bipartite Graph     4       0            0     0.
     Discrete System     2              0            0     0.
           Index Set     4              4        83136     0.
   IS L to G Mapping     2              0            0     0.
              Viewer     2              1          784     0.

 

>>>
>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- 

Re: [petsc-users] vec norm for local portion of a vector

2016-08-02 Thread Patrick Sanan
There is also VecGetLocalVector() and VecRestoreLocalVector() . You
should be able to use those to (copy-free, hopefully) obtain the local
entries of a global vector as a local vector which you could call the
usual VecNorm() on.
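
A minimal sketch of that, assuming vg is your global DMDA vector:

Vec       vloc;
PetscInt  n;
PetscReal nrm;
VecGetLocalSize(vg, &n);
VecCreateSeq(PETSC_COMM_SELF, n, &vloc);
VecGetLocalVector(vg, vloc);     /* exposes this rank's owned entries, ideally copy-free */
VecNorm(vloc, NORM_2, &nrm);     /* a per-rank (sequential) norm */
VecRestoreLocalVector(vg, vloc);
VecDestroy(&vloc);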


On Fri, Jul 29, 2016 at 4:41 PM, Barry Smith  wrote:
>
>> On Jul 27, 2016, at 4:42 PM, Xiangdong  wrote:
>>
>> Hello everyone,
>>
>> I have a global dmda vector vg. On each processor, if I want to know the 
>> norm of local portion of vg, which function should I call?
>>
>> So far I am thinking of using DMDAVecGetArray and then write a loop to 
>> compute the norm of this local array.
>>
>> Is there a simple function available to call? like 
>> *vg->ops->norm_local(vg,NORM_2, )?
>
> There isn't a public interface to this call because it really isn't a 
> mathematically well defined object; the subdomains in the decomposition of 
> the array are arbitrary based on the number of processes used.
>
>Anyways if you want it and it is the NON-overlapping portion then yes, you 
> can write a little routine (basically just cut and paste VecNorm()) call it 
> say VecNormLocal() and have it call the function pointer you indicated above. 
> Note for the 2 norm the norm_local() returns the square of the norm so you 
> need to take the square root.
>
>If you want the overlapping portion of the vector then you should just do 
> the DMDAVecGetArray() as you already do.
>
>Barry
>
>
>
>>
>> Thanks.
>>
>> Best,
>> Xiangdong
>>
>


Re: [petsc-users] [RFC] Docs: TeX -> HTML

2016-07-23 Thread Patrick Sanan
I have slowly been doing some work to clean up the manual a little
bit, mainly just fixing the formatting where it needs attention, but
also updating the content where it is obviously out of date, so I'm
interested in working on resolving this. The latex version is of
course nice in that it can look pretty with latex tools, but the
advantage of having html documentation which is more friendly to
search engines is undeniable.

Which latex packages are giving trouble? Maybe we can figure out a way
to sufficiently reduce the dependencies.

On Sat, Jul 23, 2016 at 5:25 AM, Marco Zocca  wrote:
> Dear all,
>
>   following the discussion at PETSc'16, I have tried to render the
> TeX-based manual into HTML with latex2html [1] and pandoc [2] .
>
> Neither attempt was successful, because of the presence of certain
> external TeX packages used for rendering various custom aspects of the
> manual.
>
> There is no 1:1 way of converting such a document. However there are a
> number of templates for rendering static websites that use LaTeX math
> and verbatim source code (e.g. readthedocs [3] for manual-type
> documents, which also supports MathJax [4] and re-renders at every
> repository push).
>
> At any rate, the conversion requires copying blocks of text and code
> to the web-based version, i.e. removing all the LaTeX markup,
> therefore effectively committing to maintaining 2 versions of the
> manual up to date and in sync with each other.
>
>
> Before committing to any approach, I would like your input on this:
>
> 1) Do you have any preference for web rendering/site hosting solution?
>
> 2) Are you OK with the idea of essentially forking the manual into PDF
> output and web output ? It is not huge work (an afternoon of tweaking
> initially and a couple minutes at every new release) but we should be
> sure about the approach in the first place.
>
> Any and all feedback is welcome;
>
> Thank you and kind regards,
> Marco
>
>
> [1] https://www.ctan.org/tex-archive/support/latex2html/
> [2] http://pandoc.org/
> [3] https://readthedocs.org/
> [4] http://mathjax.readthedocs.io/en/latest/tex.html


Re: [petsc-users] Functionality in Petsc

2016-06-06 Thread Patrick Sanan
Assuming that you don't want to store this dense rank-1 matrix explicitly, one 
way to go is to define your own MATSHELL which holds your two vectors and can 
apply the outer product to a vector.
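
A minimal sketch of what that shell could look like (real-valued case; the
context layout and the OuterMult name are just for illustration, and the
creation call is only sketched in the trailing comment):

#include <petscmat.h>

typedef struct { Vec u, v; } OuterCtx;   /* represents y = (u v^T) x = (v . x) u */

static PetscErrorCode OuterMult(Mat M, Vec x, Vec y)
{
  OuterCtx    *ctx;
  PetscScalar  a;
  MatShellGetContext(M, &ctx);
  VecDot(ctx->v, x, &a);   /* a = v^T x in the real case */
  VecCopy(ctx->u, y);
  VecScale(y, a);          /* y = a u */
  return 0;
}

/* Creation, roughly (row sizes from u, column sizes from v):
   MatCreateShell(PETSC_COMM_WORLD, m_u_local, m_v_local, M_u, M_v, &user, &M);
   MatShellSetOperation(M, MATOP_MULT, (void (*)(void))OuterMult);            */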



> On 06.06.2016 at 19:49, Sophie Léger wrote:
> 
> Hello,
>  
> I’m trying to multiply a column vector by a row vector (which would result 
> in a matrix) and am wondering if this is possible with Petsc? I have looked, 
> but have not found anything yet. Would it be possible to let me know if such 
> a functionality exists and if so guide me in the right direction?
>  
> Thank you very much for your help!
> Sophie Leger
> Mathematics professor – Université de Moncton


Re: [petsc-users] MOOSE_SNES_ISSUES

2016-03-01 Thread Patrick Sanan
On Tue, Mar 01, 2016 at 09:58:18AM +0100, alena kopanicakova wrote:
> Hello,
> 
> I am developing my own nonlinear solver, which aims to solve simulations
> from MOOSE. Communication with moose is done via SNES interface.
> 
> I am obtaining Jacobian and residual in following way:
> 
>SNESComputeFunction(snes, X, R);
> 
>   SNESSetJacobian(snes, jac, jac, SNESComputeJacobianDefault,
As documented here,
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESComputeJacobianDefault.html
it seems like you are computing the Jacobian in a brute-force way with finite
differences.
Do you have an expression to compute the Jacobian?
> NULL);
>   SNESComputeJacobian(snes, X, jac,  jac);
> 
> Unfortunately, by this setting it takes incredible amount of time to obtain
> Jacobian.
> Taking closer look at perf log, it seems that difference between mine and
> MOOSE solver is, that my executioner calls compute_residual() function
> ridiculously many times.
> I have no idea, what could be causing such a behavior.
> 
> Do you have any suggestions, how to set up interface properly? or which
> change is needed for obtaining more-less same performance as MOOSE
> executioner?
> 
> 
> many thanks,
> alena

> 
> Framework Information:
> MOOSE version:   git commit e79292e on 2016-01-05
> PETSc Version:   3.6.1
> Current Time:Tue Mar  1 00:51:45 2016
> Executable Timestamp:Tue Mar  1 00:36:30 2016
> 
> Parallelism:
>   Num Processors:  1
>   Num Threads: 1
> 
> Mesh: 
>   Distribution:serial
>   Mesh Dimension:  3
>   Spatial Dimension:   3
>   Nodes:   
> Total: 1331
> Local: 1331
>   Elems:   
> Total: 1000
> Local: 1000
>   Num Subdomains:  1
>   Num Partitions:  1
> 
> Nonlinear System:
>   Num DOFs:3993
>   Num Local DOFs:  3993
>   Variables:   { "disp_x" "disp_y" "disp_z" } 
>   Finite Element Types:"LAGRANGE" 
>   Approximation Orders:"FIRST" 
> 
> Execution Information:
>   Executioner: Steady
>   Solver Mode: NEWTON
> 
> 
> 
>  0 Nonlinear |R| = 2.85e-02
>   0 Linear |R| = 2.85e-02
>   1 Linear |R| = 1.653670e-02
>   2 Linear |R| = 1.447168e-02
>   3 Linear |R| = 1.392965e-02
>   4 Linear |R| = 1.258440e-02
>   5 Linear |R| = 1.007181e-02
>   6 Linear |R| = 8.264315e-03
>   7 Linear |R| = 6.541897e-03
>   8 Linear |R| = 4.371900e-03
>   9 Linear |R| = 2.100406e-03
>  10 Linear |R| = 1.227539e-03
>  11 Linear |R| = 1.026286e-03
>  12 Linear |R| = 9.180101e-04
>  13 Linear |R| = 8.087598e-04
>  14 Linear |R| = 6.435247e-04
>  15 Linear |R| = 5.358688e-04
>  16 Linear |R| = 4.551657e-04
>  17 Linear |R| = 4.090276e-04
>  18 Linear |R| = 3.359290e-04
>  19 Linear |R| = 2.417643e-04
>  20 Linear |R| = 1.710452e-04
>  21 Linear |R| = 1.261996e-04
>  22 Linear |R| = 9.384052e-05
>  23 Linear |R| = 6.070637e-05
>  24 Linear |R| = 4.283233e-05
>  25 Linear |R| = 3.383792e-05
>  26 Linear |R| = 2.342289e-05
>  27 Linear |R| = 1.700855e-05
>  28 Linear |R| = 9.814278e-06
>  29 Linear |R| = 4.398519e-06
>  30 Linear |R| = 2.161205e-06
>  31 Linear |R| = 1.289206e-06
>  32 Linear |R| = 6.548007e-07
>  33 Linear |R| = 3.677894e-07
>  34 Linear |R| = 2.640006e-07
>  1 Nonlinear |R| = 2.400310e-02
>   0 Linear |R| = 2.400310e-02
>   1 Linear |R| = 9.102075e-03
>   2 Linear |R| = 5.017381e-03
>   3 Linear |R| = 3.840732e-03
>   4 Linear |R| = 2.990685e-03
>   5 Linear |R| = 1.990203e-03
>   6 Linear |R| = 1.085764e-03
>   7 Linear |R| = 4.657779e-04
>   8 Linear |R| = 3.049692e-04
>   9 Linear |R| = 1.625839e-04
>  10 Linear |R| = 1.124700e-04
>  11 Linear |R| = 7.764153e-05
>  12 Linear |R| = 5.698577e-05
>  13 Linear |R| = 4.581843e-05
>  14 Linear |R| = 4.262610e-05
>  15 Linear |R| = 3.792804e-05
>  16 Linear |R| = 3.404168e-05
>  17 Linear |R| = 2.536004e-05
>  18 Linear |R| = 1.577559e-05
>  19 Linear |R| = 9.099392e-06
>  20 Linear |R| = 6.140685e-06
>  21 Linear |R| = 5.083606e-06
>  22 Linear |R| = 

Re: [petsc-users] Editor for PETSc

2016-02-05 Thread Patrick Sanan
If you're using vim, using ctags (you want "exuberant ctags") is a quick and 
handy way to 
look around in the PETSc source (with its included man pages).
See the manual, http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf, 
Section 15.9 . 
(and if you use emacs, see the preceding Section 15.8 for instructions on tags 
and more).
On Fri, Feb 05, 2016 at 02:27:49PM +0530, Kaushik Kulkarni wrote:
> Hello,
> I am a newbie to PETSc. I am going through the tutorials right now. But I
> am struggling a bit in finding the correct editor for me. Till now I was
> fine with VIM, as it gave me a bunch of completion options in the form of
> a drop-down box. And using the Clang support feature it would also
> highlight the errors, hence drastically decreasing my errors. Is there any
> way I could continue working in such an environment?
> 
> I was using the "YouCompleteMe"  plugin with VIM which had all of these
> features.
> 
> Thanks,
> *Kaushik*




Re: [petsc-users] Array of SNES's

2016-01-05 Thread Patrick Sanan
Do all the SNES's need to be constructed at the same time? It will
obviously require a lot of memory to store N SNES objects (or perhaps
your N is small), and if they don't all need to exist simultaneously, then do 
you have the option to
create and destroy one at a time as you loop over your grid points?
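
In C the loop would look roughly like the sketch below (the calls in comments
are placeholders for your post-processing problem; petsc4py mirrors the same
sequence of calls):

PetscInt i;
for (i = 0; i < N; ++i) {               /* N = number of dofs, assumed known */
  SNES snes;
  SNESCreate(PETSC_COMM_SELF, &snes);
  SNESSetType(snes, SNESVINEWTONRSLS);  /* variational inequality solver */
  /* SNESSetFunction / SNESSetJacobian / SNESVISetVariableBounds for dof i ... */
  SNESSetFromOptions(snes);
  /* SNESSolve(snes, NULL, x_i); */
  SNESDestroy(&snes);                   /* only one SNES alive at a time */
}
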
On Tue, Jan 05, 2016 at 09:24:11PM -0700, Justin Chang wrote:
> Timothee,
> 
> No i haven't tried, mainly because I don't know how. Btw I am not doing
> this in C or FORTRAN, I want to do this in python (via petsc4py) since I am
> trying to make this compatible with Firedrake (which is also python-based).
> 
> Thanks,
> Justin
> 
> On Tue, Jan 5, 2016 at 8:57 PM, Timothée Nicolas wrote:
> 
> > Hello and happy new year,
> >
> > Have you actually tried ? I just declared an array of 10 snes and created
> > them, and there is no complaint whatsoever. Also, something I do usually is
> > that I declare a derived type which contains some Petsc Objects (like SNES,
> > KSP, matrices, vectors, whatever), and create arrays of this derived types.
> > This works perfectly fine in my case (I use FORTRAN btw).
> >
> > Best wishes
> >
> > Timothee
> >
> >
> > 2016-01-06 11:53 GMT+09:00 Justin Chang :
> >
> >> Hi all,
> >>
> >> Is it possible to create an array of SNES's? If I have a problem size N
> >> degrees of freedom, I want each dof to have its own SNES solver (basically
> >> a pointer to N SNES's). Reason for this is because I am performing a
> >> "post-processing" step where after my global solve, each entry of my
> >> solution vector of size N will go through some algebraic manipulation.
> >>
> >> If I did a standard LU solve for these individual SNES's, I could use the
> >> same snes and this issue would be moot. But i am using the Variational
> >> Inequality, which basically requires a fresh SNES for each problem.
> >>
> >> Thanks,
> >> Justin
> >>
> >
> >


Re: [petsc-users] Solving/creating SPD systems

2015-11-28 Thread Patrick Sanan
On Sat, Nov 28, 2015 at 06:31:31AM -0600, Matthew Knepley wrote:
> On Sat, Nov 28, 2015 at 12:10 AM, Justin Chang  wrote:
> 
> > Hi all,
> >
> > Say I have a saddle-point system for the mixed-poisson equation:
> >
> > [  I   -grad ] [u]   [ 0]
> > [-div    0   ] [p] = [-f]
> >
> > The above is symmetric but indefinite. I have heard that one could make
> > the above symmetric and positive definite (SPD). How would I do that? And
> > if that's the case, would this allow me to use CG instead of GMRES?
> >
> 
> I believe you just multiply the bottom row by -1. You can use CG for an SPD
> system, but you can
> use MINRES for symmetric indefinite.
If I'm remembering correctly, flipping that sign lets you make your system
either positive definite or symmetric, but not both. Maybe you were hearing
about the Bramble-Pasciak
preconditioner or a related approach?
> 
>Matt
> 
> 
> > Thanks,
> > Justin
> >
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener




Re: [petsc-users] Parallel ODE solver

2015-10-20 Thread Patrick Sanan
Further note that you can use sundials from within PETSc, without any
more hassle than supplying a configure flag (--download-sundials, or
see the other entries under SUNDIALS: with ./configure --help).
That could potentially be very helpful in moving to TS.

http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/TS/TSSUNDIALS.html
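
A minimal sketch of selecting it (assuming PETSc was configured with
--download-sundials; RHSFunction and user are placeholders for your problem):

TS ts;
TSCreate(PETSC_COMM_WORLD, &ts);
TSSetProblemType(ts, TS_NONLINEAR);
TSSetRHSFunction(ts, NULL, RHSFunction, &user);
TSSetType(ts, TSSUNDIALS);          /* runtime equivalent: -ts_type sundials */
TSSetFromOptions(ts);
/* TSSolve(ts, u); */
TSDestroy(&ts);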


On Tue, Oct 20, 2015 at 09:21:04AM +0200, Julian Andrej wrote:
> There are a variety of algorithms for ODE/DAEs in the PetscTS
> implementation. For examples see e.g.
> http://www.mcs.anl.gov/petsc/petsc-current/src/ts/examples/tutorials/index.html
> 
> Sundials is also capable of parallel solving, even from the Matlab
> sundialsTB interface, as far as i know.
> 
> On Tue, Oct 20, 2015 at 9:10 AM, Sahai, Amal  wrote:
> > I am trying to solve a very large system of ODEs for modeling the
> > time-varying composition of a homogeneous (no spatial variation) reaction
> > mixture composed of 5 species. I was wondering if there is parallel ode
> > solver included in PETSc? I have been using the sundials package for solving
> > a smaller system in serial and would like to move on to parallel framework
> > to speed up my calculations.
> >
> > Regards
> > Amal




Re: [petsc-users] forming a matrix from a set of vectors

2015-08-26 Thread Patrick Sanan
On Wed, Aug 26, 2015 at 11:02 PM, Patrick Sanan patrick.sa...@gmail.com
wrote:



 On Wed, Aug 26, 2015 at 10:41 PM, Nicolas Pozin nicolas.po...@inria.fr
 wrote:

 Actually I want to get the diagonal of the matrix : transpose(d)*A*d where
 -d is a sparse matrix of size (n1,m1)
 -A is a dense symmetric matrix of size (n1,n1)
 with m1 very big compared to n1 (1 million against a few dozens).

 If I read this correctly, another way to phrase what you need is
  ||d_i||_A^2 = (d_i, A d_i), for a few dozen values of i. Naively you could
 do that by iterating through an array of Vec objects (which need not all be
 stored in memory simultaneously), calling MatMult followed by VecDot. You
 could perhaps get more clever later (if the size of the system justifies
 it) by doing things like using non-blocking/split versions of VecDot (or
 VecMDot) so that you can overlap the matrix multiplications with the dot
 products.

Ah, sorry, I had the sparsity of A and d reversed in my reading.


 The problem is too big to allow the use of MatMatMult.
 What I planned to do :
 -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th
 column of A : quick since d is sparse and n1 is small
 -deduce the matrix transpose(d)*A = [V1 ... Vn]
 and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through
 -transpose([V1 ...Vn]) and get its columns C1 ... Cn
 -conclude on the i-th diagonal value which is the i-th component of
 transpose(d)*Ci



 - Mail original -
  De: Barry Smith bsm...@mcs.anl.gov
  À: Nicolas Pozin nicolas.po...@inria.fr
  Cc: Jed Brown j...@jedbrown.org, petsc-users@mcs.anl.gov
  Envoyé: Mercredi 26 Août 2015 22:21:04
  Objet: Re: [petsc-users] forming a matrix from a set of vectors
 
 
   On Aug 26, 2015, at 3:06 PM, Nicolas Pozin nicolas.po...@inria.fr
 wrote:
  
   Thank you for this answer.
  
   What I want to do is to get the lines of this matrix and store them in
   vectors.
 
If you want to treat the columns of the dense matrix as vectors then
 use
MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to
 the
first row of each column of the obtained array (PETSc dense matrices
 are
stored by column; same as for example LAPACK).
 
But if you explained more why you want to treat something sometimes
 as a
Mat (which is a linear operator on vectors) and sometimes as vectors
 we
might be able to suggest how to organize your code.
 
 Barry
 
  
  
   - Mail original -
   De: Jed Brown j...@jedbrown.org
   À: Nicolas Pozin nicolas.po...@inria.fr, petsc-users@mcs.anl.gov
   Envoyé: Mercredi 26 Août 2015 20:38:37
   Objet: Re: [petsc-users]  forming a matrix from a set of vectors
  
   Nicolas Pozin nicolas.po...@inria.fr writes:
   Given a set of vectors V1, V2,...,Vn, is there an efficient way to
 form
   the
   dense matrix [V1 V2 ... Vn]?
  
   What do you want to do with that matrix?  The vector representation
 is
   pretty flexible and the memory semantics are similar unless you store
   the dense matrix row-aligned (not the default).
  
 
 





Re: [petsc-users] I am wondering if there is a way to implement SPMM

2015-08-04 Thread Patrick Sanan
On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote:
 Thanks for your reply.
 
 I have an other question.
 I want to do SPMM several times and combine result matrices into one bigger
 matrix.
 for example
 I firstly calculate AX1=B1, AX2=B2 ...
 then I want to combine B1, B2.. to get a C, where C=[B1,B2...]
 
 Could you please suggest a way of how to do this.
This is just linear algebra, nothing to do with PETSc specifically.
A * [X1, X2, ... ] = [AX1, AX2, ...] 
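
Concretely, if the X_i are stored as the columns of one dense matrix, this is
a single call; a sketch only (A assembled as MATAIJ and X as MATDENSE, both
assumed to exist already):

Mat B;
MatMatMult(A, X, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &B);  /* B = [A X1, A X2, ...] */
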
 
 Thanks
 
 Cong Li
 
 On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown j...@jedbrown.org wrote:
 
  Cong Li solvercorle...@gmail.com writes:
 
   Hello,
  
   I am a PhD student using PETsc for my research.
   I am wondering if there is a way to implement SPMM (Sparse matrix-matrix
   multiplication) by using PETSc.
 
 
  http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html
 




Re: [petsc-users] I am wondering if there is a way to implement SPMM

2015-08-04 Thread Patrick Sanan
On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote:
 I am sorry that I should have explained it more clearly.
 Actually I want to compute a recurrence.
 
 Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2,
 A*B2=B3 and so on.
 Finally I want to combine all these results into a bigger matrix C=[B1,B2
 ...]
 
 Is there any way to do this efficiently.
With no other information about your problem, one literal solution might be to 
use MATNEST to define C once you have computed B1,B2,.. 
However, this invites questions about what you plan to do with C and whether 
you require explicit representations of some or all of these matrices, and what 
problem sizes you are considering. 
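
If the B_i have already been computed separately, a rough sketch of the
MATNEST route (B1, B2, B3 are assumed to exist and to share a compatible row
layout):

Mat blocks[3] = {B1, B2, B3};
Mat C;
MatCreateNest(PETSC_COMM_WORLD, 1, NULL, 3, NULL, blocks, &C);  /* C = [B1 B2 B3], no copies */
/* C can then be used like any other Mat (MatMult, viewing, etc.) */
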
 
 
 
 On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan patrick.sa...@gmail.com
 wrote:
 
  On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote:
   Thanks for your reply.
  
   I have an other question.
   I want to do SPMM several times and combine result matrices into one
  bigger
   matrix.
   for example
   I firstly calculate AX1=B1, AX2=B2 ...
   then I want to combine B1, B2.. to get a C, where C=[B1,B2...]
  
   Could you please suggest a way of how to do this.
  This is just linear algebra, nothing to do with PETSc specifically.
  A * [X1, X2, ... ] = [AX1, AX2, ...]
  
   Thanks
  
   Cong Li
  
   On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown j...@jedbrown.org wrote:
  
Cong Li solvercorle...@gmail.com writes:
   
 Hello,

 I am a PhD student using PETsc for my research.
 I am wondering if there is a way to implement SPMM (Sparse
  matrix-matrix
 multiplication) by using PETSc.
   
   
   
  http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html
   
 




Re: [petsc-users] Profiling/checkpoints

2015-08-04 Thread Patrick Sanan
And note that it is possible to run gdb/lldb on each of several MPI processes, 
useful when you hit a bug that only appears in parallel.  For example, this FAQ 
describes a couple of ways to do this:

https://www.open-mpi.org/faq/?category=debugging#serial-debuggers


 On 05.08.2015 at 00:36, Barry Smith bsm...@mcs.anl.gov wrote:
 
 
  Correction, even in parallel you should be able to use a 0 for the viewer 
 for calls to KSPView() etc; just make sure you do the same call on each 
 process that shares the object. 
 
   To change the viewer format you do need to use 
 PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), 
 PETSC_VIEWER_ASCII_INFO)  to change the format for parallel objects that live 
 on PETSC_COMM_WORLD.
 
 
  Barry
 
 PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only affects the format of 
 the sequential ASCII viewer.
 
 On Aug 4, 2015, at 5:22 PM, Barry Smith bsm...@mcs.anl.gov wrote:
 
 
 I do this by running in the debugger and putting in breakpoints. At the 
 breakpoint you can look directly at variables like the n in call to 
 VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer 
 of 0) and it will print out the information about the object right then. 
 Calling VecView() or MatView() directly will of course cause it to print the 
 entire object which generally you don't want but you can do 
 PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView 
 to have it print size information etc about the object instead of the full 
 object. In parallel instead of passing 0 for the viewer you need to pass 
 PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that 
 share the object call the routine in the debugger but it is possible.
 
 Let us know how it goes and we can try to improve the experience with your 
 suggestions,
 
 Barry
 
 On Aug 4, 2015, at 5:09 PM, Justin Chang jychan...@gmail.com wrote:
 
 Hi all,
 
 Not sure what to title this mail, but let me begin with an analogy of what 
 I am looking for:
 
 In MATLAB, we could insert breakpoints into the code, such that when we run 
 the program, we could pause the execution and see what the variables 
 contain and what is going on exactly within your function calls. Is there a 
 way to do something like this within PETSc?
 
 I want to see what's going on within certain PETSc functions within 
 KSPSolve. For instance, -log_summary says that my solver invokes calls to 
 functions like VecMDot and VecMAXPY but I would like to know exactly how 
 many vectors each of these functions is working with. Moreover, I would 
 also like to get a general overview of the properties of the matrices 
 MatPtAP and MatMatMult are playing with (e.g., dimensions, number of 
 nonzeros, etc).
 
 Or
 
 Above functions happen to be invoked from gamg, so is it possible to tell 
 just from the parameters fed into PETSc what the answers to the above may 
 be?
 
 Thanks,
 Justin
 


Re: [petsc-users] MatMatMult involving MPIAIJ and MATNEST

2015-05-11 Thread Patrick Sanan
MatMatMult with MATNEST does not seem to be supported, based only on the
fact that there are no functions of the form MatMatMult_*_MatNest defined
with the MATNEST implementation :
http://www.mcs.anl.gov/petsc/petsc-current/src/mat/impls/nest/matnest.c.html

Does this operation need to be efficient? (That is, are you forming this
matrix for debugging or experimental purposes, or with the intention of
using it within an efficient, scalable piece of code?). If not, it should
be possible to use lower-level matrix operations as defined by the API to
extract the appropriately-sized submatrices from A and (assuming that
MatMatMult is defined between MATMPIAIJ and the submatrices of your
Matnest), perform the matrix multiplications and additions by hand.
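A rough sketch of that by-hand block product, with the caveats above (all names are made up, every block of the nest is assumed non-NULL, MatMatMult is assumed to support the types involved, and MatCreateSubMatrix was called MatGetSubMatrix in older releases):

#include <petscmat.h>

/* Hypothetical helper: C = A*B with A of type MPIAIJ and B of type MATNEST.
 * The j-th block column of C is sum_k A(:,rows_k) * B_kj, where rows_k is the
 * k-th block-row index set of B.  The result blocks are returned in the
 * caller-allocated array c[], one Mat per block column of B. */
PetscErrorCode NestMatMatMult(Mat A, Mat B, Mat c[])
{
  PetscErrorCode ierr;
  PetscInt       nr, nc, j, k, rstart, rend;
  IS             *isrow, *iscol, allrows;
  Mat            Bkj, Asub, P;

  PetscFunctionBeginUser;
  ierr = MatNestGetSize(B, &nr, &nc);CHKERRQ(ierr);
  ierr = PetscMalloc2(nr, &isrow, nc, &iscol);CHKERRQ(ierr);
  ierr = MatNestGetISs(B, isrow, iscol);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  ierr = ISCreateStride(PetscObjectComm((PetscObject)A), rend - rstart, rstart, 1, &allrows);CHKERRQ(ierr);
  for (j = 0; j < nc; ++j) {
    c[j] = NULL;
    for (k = 0; k < nr; ++k) {
      ierr = MatNestGetSubMat(B, k, j, &Bkj);CHKERRQ(ierr);
      ierr = MatCreateSubMatrix(A, allrows, isrow[k], MAT_INITIAL_MATRIX, &Asub);CHKERRQ(ierr);  /* A(:,rows_k) */
      ierr = MatMatMult(Asub, Bkj, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &P);CHKERRQ(ierr);
      if (!c[j]) c[j] = P;
      else {
        ierr = MatAXPY(c[j], 1.0, P, DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
        ierr = MatDestroy(&P);CHKERRQ(ierr);
      }
      ierr = MatDestroy(&Asub);CHKERRQ(ierr);
    }
  }
  ierr = ISDestroy(&allrows);CHKERRQ(ierr);
  ierr = PetscFree2(isrow, iscol);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}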

On Mon, May 11, 2015 at 2:44 PM, Bikash Kanungo bik...@umich.edu wrote:

 Hi,

 I have two matrices: A of type MPIAIJ and B of type MATNEST. Is there any
 way to perform A*B after B has been assembled from its sub-matrices?

 Thanks,
 Bikash

 --
 Bikash S. Kanungo
 PhD Student
 Computational Materials Physics Group
 Mechanical Engineering
 University of Michigan




Re: [petsc-users] MatMatMult involving MPIAIJ and MATNEST

2015-05-11 Thread Patrick Sanan
Are you storing M^{-1} explicitly? In that case you could use MATCOMPOSITE
to define M^{-1}H and feed that to SLEPc's standard solver. If you only
have M explicitly, you could use MATSHELL to define your own operator which
applies M^{-1}H and use that with SLEPc's standard solver (though of course
SLEPc would be free to implement its generalized eigensolvers this way
and/or in better ways for certain problems - I am not enough of an expert
on that library to say more).
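A minimal sketch of the MATSHELL variant, assuming M^{-1} and H are available as Mats (the context struct, function names, and work vector are illustrative only):

#include <petscmat.h>

/* Illustrative shell operator whose action is y = Minv*(H*x). */
typedef struct {
  Mat Minv, H;
  Vec work;
} MinvHCtx;

static PetscErrorCode MinvHMult(Mat S, Vec x, Vec y)
{
  PetscErrorCode ierr;
  MinvHCtx       *ctx;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(S, &ctx);CHKERRQ(ierr);
  ierr = MatMult(ctx->H, x, ctx->work);CHKERRQ(ierr);     /* work = H x        */
  ierr = MatMult(ctx->Minv, ctx->work, y);CHKERRQ(ierr);  /* y    = M^{-1} H x */
  PetscFunctionReturn(0);
}

/* Usage (after filling a MinvHCtx *ctx, including ctx->work):
 *   MatCreateShell(comm, m, n, M, N, ctx, &S);
 *   MatShellSetOperation(S, MATOP_MULT, (void (*)(void))MinvHMult);
 * and hand S to SLEPc's standard (non-generalized) eigensolver. */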

On Mon, May 11, 2015 at 7:12 PM, Bikash Kanungo bik...@umich.edu wrote:

 Hi Matthew,

 Yes SLEPc does allow me to solve a generalized eigenvalue problem. But the
 generalized solvers are not as robust as the standard eigenvalue solvers.
 That's the reason I want to use explicit MatMatMult so that I can use
 standard eigenvalue solvers.

 Thanks,
 Bikash
 On May 11, 2015 1:00 PM, Matthew Knepley knep...@gmail.com wrote:

 On Mon, May 11, 2015 at 8:47 AM, Bikash Kanungo bik...@umich.edu wrote:

 Hi Patrick,

 Thanks for the clarification. I want it to be an efficient operation.
 The idea is to convert a generalized eigenvalue problem (H*x = lambda*M*x)
 to a standard one (M^{-1}*H*x = lambda*x). The M^{-1} is of type MATNEST
 whereas H is MPIAIJ. In my problem M^{-1} is computed once whereas H
 changes every iteration. So performing low-level matrix multiplication or
 creating H as MATNEST and assembling it from its sub-matrices at every
 iteration sounds inefficient.


 You can solve these types of things with SLEPc without explicit MatMat.

Matt


 Regards,
 Bikash

 On Mon, May 11, 2015 at 9:21 AM, Patrick Sanan patrick.sa...@gmail.com
 wrote:

 MatMatMult with MATNEST does not seem to be supported, based only on
 the fact that there are no functions of the form MatMatMult_*_MatNest
 defined with the MATNEST implementation :
 http://www.mcs.anl.gov/petsc/petsc-current/src/mat/impls/nest/matnest.c.html

 Does this operation need to be efficient? (That is, are you forming
 this matrix for debugging or experimental purposes, or with the intention
 of using it within an efficient, scalable piece of code?). If not, it
 should be possible to use lower-level matrix operations as defined by the
 API to extract the appropriately-sized submatrices from A and (assuming
 that MatMatMult is defined between MATMPIAIJ and the submatrices of your
 Matnest), perform the matrix multiplications and additions by hand .

 On Mon, May 11, 2015 at 2:44 PM, Bikash Kanungo bik...@umich.edu
 wrote:

 Hi,

 I have two matrices: A of type MPIAIJ and B of type MATNEST. Is there
 any way to perform A*B after B has been assembled from its sub-matrices?

 Thanks,
 Bikash

 --
 Bikash S. Kanungo
 PhD Student
 Computational Materials Physics Group
 Mechanical Engineering
 University of Michigan





 --
 Bikash S. Kanungo
 PhD Student
 Computational Materials Physics Group
 Mechanical Engineering
 University of Michigan




 --
 What most experimenters take for granted before they begin their
 experiments is infinitely more interesting than any results to which their
 experiments lead.
 -- Norbert Wiener




Re: [petsc-users] MatMatMult involving MPIAIJ and MATNEST

2015-05-11 Thread Patrick Sanan
(forgot to reply-all)
I think that depends on how sparse M^{-1} and H are.



On Mon, May 11, 2015 at 7:41 PM, Bikash Kanungo bik...@umich.edu wrote:

 Hi Patrick,

 I have M^{-1} computed explicitly and stored as MATNEST. Providing
 MATCOMPOSITE of M^{-1}H to SLEPc standard eigenvalue solvers seems an
 option. Does it have any additional cost over explicitly building M^{-1}H
 (assuming I have a way to do so) and then providing it to standard
 eigenvalue solvers?

 Thanks,

 Bikash

 On Mon, May 11, 2015 at 1:37 PM, Bikash Kanungo bik...@umich.edu wrote:

 Hi Patrick,

 I have M^{-1} computed explicitly and stored as MATNEST. Providing
 MATCOMPOSITE of M^{-1}H to SLEPc standard eigenvalue solvers seems an
 option. Does it has any additional cost of explicitly building
 On May 11, 2015 1:30 PM, Patrick Sanan patrick.sa...@gmail.com wrote:

 Are you storing M^{-1} explicitly? In that case you could use
 MATCOMPOSITE to define M^{-1}H and feed that to SLEPc's standard solver. If
 you only have M explicitly, you could use MATSHELL to define your own
 operator which applies M^{-1}H and use that with SLEPc's standard solver
 (though of course SLEPc would be free to implement its generalized
 eigensolvers this way and/or in better ways for certain problems - I am not
 enough of an expert on that library to say more).

 On Mon, May 11, 2015 at 7:12 PM, Bikash Kanungo bik...@umich.edu
 wrote:

 Hi Matthew,

 Yes SLEPc does allow me to solve a generalized eigenvalue problem. But
 the generalized solvers are not as robust as the standard eigenvalue
 solvers. That's the reason I want to use explicit MatMatMult so that I can
 use standard eigenvalue solvers.

 Thanks,
 Bikash
 On May 11, 2015 1:00 PM, Matthew Knepley knep...@gmail.com wrote:

 On Mon, May 11, 2015 at 8:47 AM, Bikash Kanungo bik...@umich.edu
 wrote:

 Hi Patrick,

 Thanks for the clarification. I want it to be an efficient operation.
 The idea is to convert a generalized eigenvalue problem (H*x = lambda*M*x)
 to a standard one (M^{-1}*H*x = lambda*x). The M^{-1} is of type MATNEST
 whereas H is MPIAIJ. In my problem M^{-1} is computed once whereas H
 changes every iteration. So performing low-level matrix multiplication or
 creating H as MATNEST and assembling it from its sub-matrices at every
 iteration sounds inefficient.


 You can solve these types of things with SLEPc without explicit MatMat.

Matt


 Regards,
 Bikash

 On Mon, May 11, 2015 at 9:21 AM, Patrick Sanan 
 patrick.sa...@gmail.com wrote:

 MatMatMult with MATNEST does not seem to be supported, based only on
 the fact that there are no functions of the form MatMatMult_*_MatNest
 defined with the MATNEST implementation :
 http://www.mcs.anl.gov/petsc/petsc-current/src/mat/impls/nest/matnest.c.html

 Does this operation need to be efficient? (That is, are you forming
 this matrix for debugging or experimental purposes, or with the 
 intention
 of using it within an efficient, scalable piece of code?). If not, it
 should be possible to use lower-level matrix operations as defined by 
 the
 API to extract the appropriately-sized submatrices from A and (assuming
 that MatMatMult is defined between MATMPIAIJ and the submatrices of your
 Matnest), perform the matrix multiplications and additions by hand .

 On Mon, May 11, 2015 at 2:44 PM, Bikash Kanungo bik...@umich.edu
 wrote:

 Hi,

 I have two matrices: A of type MPIAIJ and B of type MATNEST. Is
 there any way to perform A*B after B has been assembled from its
 sub-matrices?

 Thanks,
 Bikash

 --
 Bikash S. Kanungo
 PhD Student
 Computational Materials Physics Group
 Mechanical Engineering
 University of Michigan





 --
 Bikash S. Kanungo
 PhD Student
 Computational Materials Physics Group
 Mechanical Engineering
 University of Michigan




 --
 What most experimenters take for granted before they begin their
 experiments is infinitely more interesting than any results to which their
 experiments lead.
 -- Norbert Wiener





 --
 Bikash S. Kanungo
 PhD Student
 Computational Materials Physics Group
 Mechanical Engineering
 University of Michigan




Re: [petsc-users] cached values

2015-04-26 Thread Patrick Sanan
This might be more basic than what you're trying to find out, but the
reason that assembly is required in the first place (even for serial
code) is that during the process of defining the matrix, it's useful
to be able to insert values anywhere in the matrix. I'm not sure what PETSc
does precisely without digging into the source, but a common way to approach
things is to maintain a data structure which looks like a C array of
(row, col, val) entries (if you have ever used the Eigen library, they call
these Triplets and you can work with an STL vector of them). This is the
COO (COOrdinate list) format for storing a sparse matrix. Typically one
then uses a more highly compressed format, like CSR (Compressed Row Storage),
which does not allow efficient insertion and deletion of entries. This
tension motivates requiring the user to explicitly assemble the matrix.
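A generic illustration in plain C (this is not PETSc's actual internal layout, just the two formats side by side):

/* The 3x3 matrix
 *   [1 0 2]
 *   [0 3 0]
 *   [4 0 5]
 * as a COO triplet list (cheap to append to while setting values) and as
 * CSR (compact and fast to apply, but expensive to insert into). */
typedef struct { int row, col; double val; } Triplet;

Triplet coo[] = {{0,0,1.0}, {0,2,2.0}, {1,1,3.0}, {2,0,4.0}, {2,2,5.0}};

/* CSR: the nonzeros of row i are vals[rowptr[i] .. rowptr[i+1]-1] */
int    rowptr[] = {0, 2, 3, 5};
int    colidx[] = {0, 2, 1, 0, 2};
double vals[]   = {1.0, 2.0, 3.0, 4.0, 5.0};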

On Sun, Apr 26, 2015 at 7:13 PM, Jed Brown j...@jedbrown.org wrote:

 Justin Chang jychan...@gmail.com writes:

  Jed,
 
  Thank you for the response. But where exactly are the values from
  MatSetValues() stored or located? That is, are they in main memory, or
 are
  they occupying space in the L1/L2/L3 cache?

 Caches are not programmable in that way.  PETSc incrementally allocates
 a data structure that holds the stashed values.  It depends on the
 details of your application and the hardware cache semantics whether a
 particular access will trigger a cache miss or not.  You can find out
 using hardware performance counters, but it's not likely to be worth
 your time unless you just want to learn those tools.



Re: [petsc-users] The parallel preconditioner

2015-04-17 Thread Patrick Sanan
You can find the current list of preconditioners on line 40 here:

http://www.mcs.anl.gov/petsc/petsc-current/include/petscpc.h.html

You can click on the names to be taken to man pages.

However, since preconditioners are by nature problem-dependent, it's a better 
idea to look up what might be an appropriate preconditioner for your 
system and then see how to use PETSc to apply it. There's a high probability 
that what you need is supported either natively by PETSc or, as Jed's note 
demonstrates, through interfaces to external libraries (of which there are a 
large number - run ./configure -help to get an idea of the options available 
and the configure flags to enable them).
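As a generic sketch of that workflow, one might set a default in code and leave it overridable from the command line (the block Jacobi default here is only an example):

#include <petscksp.h>

/* Hypothetical helper: choose a preconditioner programmatically but let
 * -pc_type, -sub_pc_type, ... override it at runtime.  The KSP is assumed to
 * have had its operator set with KSPSetOperators(). */
PetscErrorCode ConfigureSolver(KSP ksp)
{
  PetscErrorCode ierr;
  PC             pc;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCBJACOBI);CHKERRQ(ierr);  /* default: block Jacobi, one block per process */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);    /* e.g. run with -pc_type asm -sub_pc_type ilu  */
  PetscFunctionReturn(0);
}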


 On 16 Apr 2015, at 23:54, huaibao zhang huaibao.zh...@gmail.com wrote:
 
 Hello there,
 
 I have been confused about the extent of the available preconditioners in PETSc 
 for a while. Here is the list I got from one slide of a set of lecture notes:
 
 Block Jacobi,
 Jacobi,
 Overlapping Additive Schwarz,
 ICC, ILU (sequential only)
 ILU(K), LU(direct solver, sequential only)
 sor
 etc.
 
 
 But in some other sources, only LU is highlighted as sequential only. 
 Which one is correct? I am using MatCreateAIJ to construct my A. Which 
 preconditioners are available to me? By the way, I tested asm, ilu, and icc, and 
 they all failed. 
 
 Thanks,
 Paul
 


Re: [petsc-users] PETSc for distributed computing?

2015-04-09 Thread Patrick Sanan
Do you want an explicit inverse, or just to solve linear systems involving a 
dense matrix? If the latter, how many right-hand sides?

PETSc itself focuses on sparse linear algebra, but it includes an interface to 
the dense linear algebra library Elemental. 
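A minimal sketch of the Elemental route, assuming PETSc was configured with --download-elemental (the helper name is made up, and the runtime options mentioned in the comment follow the naming used by releases of that era):

#include <petscmat.h>

/* Hypothetical helper: create a distributed dense matrix of Elemental type.
 * Linear solves with it can then go through KSP with a direct factorization,
 * e.g. -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package elemental. */
PetscErrorCode CreateDistributedDense(MPI_Comm comm, PetscInt N, Mat *A)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATELEMENTAL);CHKERRQ(ierr);
  ierr = MatSetUp(*A);CHKERRQ(ierr);
  /* ... MatSetValues() / MatAssemblyBegin() / MatAssemblyEnd() as usual ... */
  PetscFunctionReturn(0);
}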

On 9 Apr 2015, at 03:06 PM, Georgios Samaras georgesamaras...@gmail.com wrote:

 Dear all,
 
   I am looking for a C/C++ linear algebra package for performing inversion of 
 a large dense matrix in a distributed environment. Is your package suitable 
 for this goal?
 
 Thanks in advance,
 Georgios Samaras



Re: [petsc-users] PETSc 3.5.2 on Mac OS X Yosemite, configuring gives C preprocessor error

2015-01-18 Thread Patrick Sanan


On 18/01/15 09:48 PM, Barry Smith wrote:

   We need configure.log

You do not need --download-blaslapack Apple always has them
As of OS X 10.9 the BLAS Apple provided was known to have bugs. I only 
noticed an issue (if I remember correctly, make test failed) when 
configuring in single precision. However, this might be a reason to have 
PETSc download BLAS/LAPACK despite the libraries being available in 
Apple's framework.


Suggest just running without passing the compiler names also.

   Barry


On Jan 18, 2015, at 2:45 PM, Justin Dong jdong...@gmail.com wrote:

Hi all,

I just updated to OS X Yosemite. I have the latest XCode and also reinstalled 
command line developer tools.

When I try to install PETSc, I get the following error:

./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran 
--download-fblaslapack --download-mpich --with-debugging=0

===

  Configuring PETSc to compile on your system

===

TESTING: checkCPreprocessor from 
config.setCompilers(config/BuildSystem/config/setCompilers.py:568)  
  
***

  UNABLE to CONFIGURE with GIVEN OPTIONS(see configure.log for 
details):

---

Cannot find a C preprocessor

***



Can someone explain to me what's going on here?



Thanks,

Justin





[petsc-users] Dumping KSP solve information to a file

2014-12-03 Thread Patrick Sanan
I'd like to be able to obtain and process post-solve information about 
KSP solves, after a PETSc application has run.  Specifically, I would 
like the information produced by -ksp_converged_reason, as well as the 
final residual norm, to be available as a file for post-processing.


A direct approach is to modify the application source, adding my own 
code invoking KSPGetIterationNumber(), KSPGetConvergedReason(), 
KSPGetResidualNorm(), etc. to the source, and producing the required 
output with an ASCII viewer.
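For concreteness, a rough sketch of that direct approach (the helper name, file name, and output format are arbitrary):

#include <petscksp.h>

/* Hypothetical helper: after KSPSolve(), write the convergence information
 * to a text file with an ASCII viewer. */
PetscErrorCode WriteSolveSummary(KSP ksp, const char *fname)
{
  PetscErrorCode     ierr;
  PetscViewer        viewer;
  KSPConvergedReason reason;
  PetscInt           its;
  PetscReal          rnorm;

  PetscFunctionBeginUser;
  ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
  ierr = KSPGetResidualNorm(ksp, &rnorm);CHKERRQ(ierr);
  ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD, fname, &viewer);CHKERRQ(ierr);
  ierr = PetscViewerASCIIPrintf(viewer, "reason %d, iterations %d, residual norm %g\n",
                                (int)reason, (int)its, (double)rnorm);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}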


However, it'd be more convenient to be able to obtain the required 
information without modifying the application source. An inelegant and 
potentially fragile approach is to provide flags like 
-ksp_converged_reason, -ksp_monitor, etc. and write a script to extract 
the required information from what is dumped to stdout.


My first thought on how to do things properly is to link in my own 
custom KSP viewer; should that be a viable approach? Is there a 
simpler/better method?


-Patrick



Re: [petsc-users] How to use

2014-10-02 Thread Patrick Sanan


On 10/2/14 7:26 PM, Sharp Stone wrote:

Hi Matt and Jed,

Thank you very much for your replies. In source code ex19, the routine 
SNESSetNGS() is used, and no other options or Jacobi routines are 
used. How can we change the iteration method? Could the routine 
SNESSetFromOptions(snes) do this for the user? If I understand 
correctly, PETSc would use the Gauss-Seidel or Jacobi method as designated, 
even with the same source code?
Yes, and this applies for other objects as well. The important thing, in 
general, is to call XXXSetFromOptions after you have set any default 
options with XXXSetYYY, but before you call XXXSetUp (which does things 
like allocate memory). This allows you to set whichever defaults make 
sense, and also control things from the command line.
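A minimal sketch of that ordering (the programmatic default shown is arbitrary, and the SNES is assumed to already have its residual and Jacobian set):

#include <petscsnes.h>

/* Hypothetical helper: defaults in code, then SNESSetFromOptions(), then the
 * solve (which calls SNESSetUp() internally if needed).  Running with, say,
 * -snes_type <type> or -pc_type jacobi then overrides the defaults without
 * recompiling. */
PetscErrorCode SolveWithOverridableDefaults(SNES snes, Vec x)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = SNESSetType(snes, SNESNEWTONLS);CHKERRQ(ierr);  /* default chosen in code   */
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);         /* command line may override */
  ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}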


Thank you in advance!

On Thu, Oct 2, 2014 at 1:12 PM, Jed Brown j...@jedbrown.org 
mailto:j...@jedbrown.org wrote:


Sharp Stone thron...@gmail.com mailto:thron...@gmail.com writes:

 Matt,

 Thank you very much!

 The ex19 used the Gauss-Seidel method. But I'm a little bit
wonder whether
 we have any examples of Jacobi method to solve such linear
systems (mat)A
 dot (vec)X = (vec)B with dof1? Thanks in advance!

ex19 does whatever you want: -pc_type jacobi and the like.




--
Best regards,

Feng




Re: [petsc-users] MKL Pardiso with MPI

2014-09-30 Thread Patrick Sanan
You should include make.log and configure.log (to answer the question of 
whether you configured with --download-mkl_pardiso, amongst others) .


On 9/30/14 12:03 PM, Baros Vladimir wrote:


Hi,

Intel released MKL 11.2 with support for Pardiso cluster version.

https://software.intel.com/en-us/articles/intel-math-kernel-library-parallel-direct-sparse-solver-for-clusters

Is it possible to use Pardiso with PETSC using MPI?

I’m on Windows if that matters.

Currently if I try to run, I will get this error:

mpiexec Solver2D.exe -ksp_type preonly -pc_type lu 
-pc_factor_mat_solver_package mkl_pardiso


[0]PETSC ERROR: - Error Message 



--

[0]PETSC ERROR: No support for this operation for this object type

[0]PETSC ERROR: Matrix format mpiaij does not have a solver package 
mkl_pardiso


for LU. Perhaps you must ./configure with --download-mkl_pardiso

[0]PETSC ERROR: See 
http://www.mcs.anl.gov/petsc/documentation/faq.html for trou


ble shooting.

[0]PETSC ERROR: Petsc Release Version 3.5.2, Sep, 08, 2014

Regards,

Vladimir





Re: [petsc-users] Query regarding accessing data in a Petsc Vec object

2014-09-28 Thread Patrick Sanan
Note that PETSc is designed to hide the details of the implementations 
from you (It's in C, but uses the techniques of Object-Oriented 
programming that you might be more familiar with in C++), so normal use 
would never require the user to know anything about the way PETSc 
objects represent data. In cases where access to lower-level 
representations of the data is useful, there is an interface provided to 
do this, as with Vec[Get/Set]Array (see the tutorials included in the 
src/ tree for many examples - if you look at the bottom of the link 
Anton provided, you'll see a handy list of links to ones that use that 
particular function).
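For example, a minimal sketch of reading the locally owned entries through that interface (the helper name is made up):

#include <petscvec.h>

/* Hypothetical helper: the supported way to look at the entries a Vec owns
 * locally, instead of reaching into the private struct members. */
PetscErrorCode PrintLocalEntries(Vec v)
{
  PetscErrorCode     ierr;
  const PetscScalar *a;
  PetscInt           i, n;

  PetscFunctionBeginUser;
  ierr = VecGetLocalSize(v, &n);CHKERRQ(ierr);
  ierr = VecGetArrayRead(v, &a);CHKERRQ(ierr);   /* use VecGetArray() if write access is needed */
  for (i = 0; i < n; ++i) {
    ierr = PetscPrintf(PETSC_COMM_SELF, "local entry %d: %g\n", (int)i, (double)PetscRealPart(a[i]));CHKERRQ(ierr);
  }
  ierr = VecRestoreArrayRead(v, &a);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}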


The definitions of the functions in VecOps will depend on the 
implementation of Vec being used, and this is in general only known at 
runtime (which is a great advantage, as you can use command line 
arguments to change which implementation is used without recompiling 
anything). The various function definitions should be found in 
src/vec/vec/impls and subdirectories.


So, while it is possible to access the data stored in an object and 
print it and so on,  you'd have to include a private header and it'd be 
a bit ugly.  Perhaps a better way to poke around is to use a debugger 
like gdb or lldb (or something else built into your IDE if you're using 
one) - that should be quicker, and not require you to spend time 
recompiling things.





On 9/28/14 1:57 PM, Parvathi M.K wrote:
I want to access the data stored in the vec object directly without 
using any functions. Is this possible?


I tried using something like p-hdr._  in a printf statement (where p 
is a vector). I wish to know which element of the _p_Vec structure 
stores the data.


On Sun, Sep 28, 2014 at 4:45 PM, Parvathi M.K phoenixaur...@gmail.com 
mailto:phoenixaur...@gmail.com wrote:


Hey,
I am fairly new to PetSc. I was just curious as to how you can
directly access the data stored in a Petsc object, say a vector
without using VecGetValues().

I've gone through the header files, petscimpl.h and vecimpl.h
which contain the definitions of the PetscObject and Vec
structures. I cannot figure out which member of the struct holds
the Vector data entered using VecSet(). Also, I could not find the
definitions of the functions in VecOps. Could someone tell me
where I can find these function definitions?

Thank you :)

Parvathi






Re: [petsc-users] Using the PCASM interface to define minimally overlapping subdomains

2014-09-17 Thread Patrick Sanan

On 9/16/14 9:43 PM, Barry Smith wrote:

On Sep 16, 2014, at 2:29 PM, Matthew Knepley knep...@gmail.com wrote:


On Tue, Sep 16, 2014 at 2:23 PM, Barry Smith bsm...@mcs.anl.gov wrote:

Patrick,

  This “local part of the subdomains for this processor” term in  
PCASMSetLocalSubdomains is, IMHO, extremely confusing. WTHWTS? Anyways, I think that 
if you set the is_local[] to be different than the is[] you will always end up with 
a nonsymetric preconditioner. I think for one dimension you need to use

No I don't think that is right. The problem below is that you have overlap in 
only one direction. Process 0 overlaps
Process 1, but Process 1 has no overlap of Process 0. This is not how Schwarz 
is generally envisioned.

   Sure it is.

Imagine the linear algebra viewpoint, which I think is cleaner here. You 
partition the matrix rows into non-overlapping
sets. These sets are is_local[]. Then any information you get from another 
domain is another row, which is put into
is[]. You can certainly have a non-symmetric overlap, which you have below, but 
it means one-way information transmission, which is strange for convergence.

   No, not a all.


|0  1  2  3 4  5 6|

   Domain 0 is the region from  |   to  4  with Dirichlet boundary conditions 
at each end (| and 4). Domain 1 is from 2 to | with Dirichlet boundary 
conditions at each end (2 and |) .

   If you look at the PCSetUp_ASM() and PCApply_ASM() you’ll see all kinds of 
VecScatter creations from the various is and is_local, “restriction”, 
“prolongation” and “localization” then in the apply the different scatters are 
applied in the two directions, which results in a non-symmetric operator.


I was able to get my uniprocessor example to give the (symmetric) 
preconditioner I expected by commenting out the check in PCSetUp_ASM 
(line 311 in asm.c) and using PCASMSetLocalSubdomains with the same 
(overlapping) IS's for both is and is_local ([0 1 2 3] and [3 4 5 6] in 
the example above). It also works passing NULL for is_local.


I assume that the purpose of the check mentioned above is to ensure that 
every grid point is assigned to exactly one processor, which is needed 
by whatever interprocess scattering goes on in the implementation. Also, 
I assume that augmenting the domain definition with an explicit 
specification of the way domains are distributed over processes allows 
for more controllable use of PC_ASM_RESTRICT, with all its attractive 
properties.


Anyhow, Barry's advice previously in this thread works locally (for one 
test case) if you remove the check above, but the current implementation 
enforces something related to what Matt describes, which might be overly 
restrictive if multiple domains share a process.  The impression I got 
initially from the documentation was that if one uses PC_ASM_BASIC, the 
choice of is_local should only influence the details of the 
communication pattern, not (in exact arithmetic, with 
process-count-independent subsolves) the preconditioner being defined.


For regular grids this all seems pretty pathological (in practice I 
imagine people want to use symmetric overlaps, and I assume that one 
domain per node is the most common use case), but I could imagine it 
being more of a real concern when working with unstructured grids.




Barry




   Matt
  


is[0] -- 0 1 2 3
is[1] -- 3 4 5 6
is_local[0] -- 0 1 2 3
is_local[1] -- 3 4 5 6

Or you can pass NULL for is_local and use PCASMSetOverlap(pc,0);

   Barry


Note that is_local[] doesn’t have to be non-overlapping or anything.


On Sep 16, 2014, at 10:48 AM, Patrick Sanan patrick.sa...@gmail.com wrote:


For the purposes of reproducing an example from a paper, I'd like to use PCASM 
with subdomains which 'overlap minimally' (though this is probably never a good 
idea in practice).

In one dimension with 7 unknowns and 2 domains, this might look like

0  1  2  3  4  5  6  (unknowns)
----------           (first subdomain  : 0 .. 3)
         ----------  (second subdomain : 3 .. 6)

The subdomains share only a single grid point, which differs from the way PCASM 
is used in most of the examples.

In two dimensions, minimally overlapping rectangular subdomains would overlap 
in exactly one row or column of the grid. Thus, for example, if the grid 
unknowns were

0  1  2  3  4  5  |
6  7  8  9  10 11 | |
12 13 14 15 16 17   |
 
---

then one minimally-overlapping set of 4 subdomains would be
0 1 2 3 6 7 8 9
3 4 5 9 10 11
6 7 8 9 12 13 14 15
9 10 11 15 16 17
as suggested by the dashes and pipes above. The subdomains only overlap by a 
single row or column of the grid.

My question is whether and how one can use the PCASM interface to work with 
these sorts of decompositions (It's fine for my purposes to use a single MPI 
process). In particular, I don't quite understand if should be possible to 
define these decompositions by correctly providing is and is_local arguments

[petsc-users] Using the PCASM interface to define minimally overlapping subdomains

2014-09-16 Thread Patrick Sanan
For the purposes of reproducing an example from a paper, I'd like to use 
PCASM with subdomains which 'overlap minimally' (though this is probably 
never a good idea in practice).


In one dimension with 7 unknowns and 2 domains, this might look like

 0  1  2  3  4  5  6  (unknowns)
 ----------           (first subdomain  : 0 .. 3)
          ----------  (second subdomain : 3 .. 6)

The subdomains share only a single grid point, which differs from the 
way PCASM is used in most of the examples.


In two dimensions, minimally overlapping rectangular subdomains would 
overlap in exactly one row or column of the grid. Thus, for example, if 
the grid unknowns were


0  1  2  3  4  5  |
6  7  8  9  10 11 | |
12 13 14 15 16 17   |
 
---

then one minimally-overlapping set of 4 subdomains would be
0 1 2 3 6 7 8 9
3 4 5 9 10 11
6 7 8 9 12 13 14 15
9 10 11 15 16 17
as suggested by the dashes and pipes above. The subdomains only overlap 
by a single row or column of the grid.


My question is whether and how one can use the PCASM interface to work 
with these sorts of decompositions (It's fine for my purposes to use a 
single MPI process). In particular, I don't quite understand if it should 
be possible to define these decompositions by correctly providing is and 
is_local arguments to PCASMSetLocalSubdomains.


I have gotten code to run defining the is_local entries to be subsets of 
the is entries which define a partition of the global degrees of 
freedom*, but I'm not certain that this was the correct choice, as it 
appears to produce an unsymmetric preconditioner for a symmetric system 
when I use direct subdomain solves and the 'basic' type for PCASM.


* For example, in the 1D example above this would correspond to
is[0] -- 0 1 2 3
is[1] -- 3 4 5 6
is_local[0] -- 0 1 2
is_local[1] -- 3 4 5 6
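For reference, setting exactly that configuration on a single process might look like the sketch below; the overlapping is_local choice is the very thing in question here, not a recommendation:

#include <petscksp.h>

/* Serial sketch: 7 unknowns, 2 subdomains as in the 1D example above. */
PetscErrorCode SetMinimalOverlapASM(PC pc)
{
  PetscErrorCode ierr;
  const PetscInt idx0[]  = {0, 1, 2, 3}, idx1[]  = {3, 4, 5, 6};
  const PetscInt idxl0[] = {0, 1, 2},    idxl1[] = {3, 4, 5, 6};
  IS             is[2], is_local[2];

  PetscFunctionBeginUser;
  ierr = ISCreateGeneral(PETSC_COMM_SELF, 4, idx0,  PETSC_COPY_VALUES, &is[0]);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_SELF, 4, idx1,  PETSC_COPY_VALUES, &is[1]);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_SELF, 3, idxl0, PETSC_COPY_VALUES, &is_local[0]);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_SELF, 4, idxl1, PETSC_COPY_VALUES, &is_local[1]);CHKERRQ(ierr);
  ierr = PCASMSetLocalSubdomains(pc, 2, is, is_local);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}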






Re: [petsc-users] MATLAB viewer

2014-07-27 Thread Patrick Sanan
A simple way to proceed is to use 
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Viewer/PetscViewerBinaryOpen.html
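A minimal sketch of that route (the file name is arbitrary, and the MATLAB-side reader script PetscBinaryRead.m ships with the PETSc source, in a location that varies between releases):

#include <petscmat.h>

/* Hypothetical helper: write A and v to one PETSc binary file for later
 * post-processing in MATLAB. */
PetscErrorCode DumpForMatlab(Mat A, Vec v)
{
  PetscErrorCode ierr;
  PetscViewer    viewer;

  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD, "output.petsc", FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);
  ierr = VecView(v, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}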

On Jul 27, 2014, at 10:18 AM, Sun, Hui hus...@ucsd.edu wrote:

 I try to create some binary output for MATLAB with the following code. 
 
 PetscViewer viewer = PETSC_VIEWER_BINARY(PETSC_COMM_WORLD);
 MatView(A,viewer);
 VecView(v,viewer);
 PetscViewerDestroy(viewer);
 
 However it gives me an error, which is the following:
 
 interface_fs.c:215:26: warning: implicit declaration of function 
 'PETSC_VIEWER_MATLAB' is invalid in C99 [-Wimplicit-function-declaration]
PetscViewer viewer = PETSC_VIEWER_MATLAB(PETSC_COMM_WORLD);
 ^
 interface_fs.c:215:17: warning: incompatible integer to pointer conversion 
 initializing 'PetscViewer' (aka 'struct _p_PetscViewer *') with an expression 
 of type 'int'
  [-Wint-conversion]
PetscViewer viewer = PETSC_VIEWER_MATLAB(PETSC_COMM_WORLD);
 
 
 I'm wondering what is the correct way to do this? Thank you!



Re: [petsc-users] SuperLU options database

2013-07-15 Thread Patrick Sanan
More things you can do to avoid typing the arguments over and over are:
- Use a file to specify options (PetscInitialize takes an argument for this file, which by default is ~/.petscrc)
- Set runtime options from within your code using PetscOptionsSetValue
- Write your own shell script or makefile with a special target to run the executable with the options you'd like appended
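For the second option, a fragment might look like this (ierr is assumed to be declared; the two-argument form matches releases of that era, while newer releases take a leading PetscOptions argument, NULL for the global database):

/* Set SuperLU ILU options programmatically, before the KSP/PC is configured. */
ierr = PetscOptionsSetValue("-mat_superlu_ilu_droptol", "1e-4");CHKERRQ(ierr);
ierr = PetscOptionsSetValue("-mat_superlu_ilu_fillfactor", "20");CHKERRQ(ierr);
/* ... then KSPSetFromOptions()/PCSetFromOptions() will pick these up. */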

On Jul 15, 2013, at 3:15 PM, Hong Zhang wrote:

 Su :
 We provide these via runtime options. With the latest petsc release (v3.4),
 you can get
 petsc/src/ksp/ksp/examples/tutorials/ex2.c: 
  ./ex2 -pc_type ilu -pc_factor_mat_solver_package superlu -help |grep superlu
 ...
   -mat_superlu_ilu_droptol 0.0001: ILU_DropTol (None)
   -mat_superlu_ilu_filltol 0.01: ILU_FillTol (None)
   -mat_superlu_ilu_fillfactor 10: ILU_FillFactor (None)
   -mat_superlu_ilu_droprull 9: ILU_DropRule (None)
   -mat_superlu_ilu_norm 2: ILU_Norm (None)
   -mat_superlu_ilu_milu 0: ILU_MILU (None)
 
 i.e., you can set/change your own options at runtime.
 
 Hong
 
 Hi, I am trying to use the external package SuperLU to solve a linear 
 equation with ILUTP preconditioner. But the current PETSc only provides one 
 option interface which is MatSuperluSetILUDropTol. There are actually many 
 other options as suggested on the manual page of MATSOLVERSUPERLU, such as:
 
 -mat_superlu_equil
 -mat_superlu_ilu_filltol
 -mat_superlu_ilu_fillfactor
 
 etc. However, they all need to be invoked via command line.
 
 Is there any way to set up other options for SuperLU in the code instead of 
 using the command line? Because it's not so convenient to type in all those 
 runtime commands every time you run a code. 
 
 Thanks a lot.
 
 Su
 
 



[petsc-users] Matrices - Arrays

2013-05-14 Thread Patrick Sanan
On May 14, 2013, at 2:15 PM, Matthew Knepley wrote:

 On Tue, May 14, 2013 at 4:06 PM, Anthony Vergottis a.vergottis at gmail.com 
 wrote:
 To whom it may concern,
 
 I am very new to Petsc and I would very much appreciate some assistance.
 
 What would be the best way to create an array of matrix objects? The number 
 of matrix objects required would be determined during run time.
 
 For example, I am looking into the FEM and I would like to create a matrix 
 for each element (I know how to set up the matrix objects). Therefore, if I 
  am using a mesh with 100 elements, I would like to determine during runtime 
  that I will need to create 100 Mats, e.g. Mat A[100], something along those 
  lines. Hope I explained it well.
 
 Sorry if this is a trivial question but your help would go a long way.
 
 I don't think you want to do this. Generally, element matrices can be raw 
 arrays. If you really want
 an array of Mat objects, declare it
 
   Mat *mats;
 
 Allocate it
 
    ierr = PetscMalloc(numMats * sizeof(Mat), &mats);CHKERRQ(ierr);
 
 and call MatCreate(), etc. for each entry.
Another alternative is to use a lightweight vector/matrix library for small 
local matrices. Since you're using C++, an easy-to-use option might be Eigen. 
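For the raw-array route Matt describes, a minimal sketch might be (nen, the node list, and the zero fill standing in for the real element computation are placeholders):

#include <petscmat.h>

/* Hypothetical helper: a small C array per element, inserted into the global
 * Mat with MatSetValues.  Assumes nen*nen <= 16 for the fixed-size buffer. */
PetscErrorCode AssembleElement(Mat K, PetscInt nen, const PetscInt nodes[])
{
  PetscErrorCode ierr;
  PetscScalar    Ke[16];  /* element matrix, row-major */
  PetscInt       i;

  PetscFunctionBeginUser;
  for (i = 0; i < nen * nen; ++i) Ke[i] = 0.0;  /* ... compute the element matrix here ... */
  ierr = MatSetValues(K, nen, nodes, nen, nodes, Ke, ADD_VALUES);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}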


