Re: [petsc-dev] Proposed changes to TS API

2018-05-12 Thread Matthew Knepley
On Fri, May 11, 2018 at 7:05 PM, Jed Brown  wrote:

> "Smith, Barry F."  writes:
>
> >Here is MY summary of the discussion so far.
> >
> > 1) the IFunction/IJacobian interface has its supporters. There is a valid
> argument that for certain cases it can be more efficient than the proposed
> alternative; but this seems largely a theoretical belief at this time,
> since there are no comparisons between nontrivial (or even trivial) codes
> that assemble the shifted mass matrix plus Jacobian directly vs codes that
> assemble the two parts separately and combine them with MatAXPY.  As far as
> I can see this potential performance is the ONLY benefit of keeping the
> IFunction/IJacobian() interface? Please list additional benefits if there
> are any?
>
> PCShell
>
> > 2) The IFunction/IJacobian approach makes it impossible for TS to access
> the mass matrix alone.
>
> Without wasted work, yes.
>
> > 3) But one can access the IJacobian matrix alone by passing a shift of
> zero
> >
> > 4) TSComputeIJacobian() is an ugly piece of shit code that has a
> confusing name since it can also incorporate the RHS Jacobian.
>
> If you get rid of that, then every implicit integrator will need to
> handle the RHSFunction/RHSJacobian itself.  It will be significantly
> more code to maintain.
>
> > 4a) the TSComputeIJacobian() has issues for linear problems because it
> overwrites the user provided Jacobian and has hacks to deal with it
>
> Yes, however that choice reduces memory usage which is sometimes an
> important factor.
>
> > 5) There is no standard way to solve M u = F() explicitly from the
> IFunction() form (and it cannot even be expressed with the explicit RHS
> approach); the user must manage the M solve themselves inside their
> RHSFunction.
>
> We could write this as an IMEX method with IFunction providing M u and
> RHSFunction providing F.  I think this could be specialized for constant
> M and provided automatically for any explicit method.


We need this for PyLith, and we are doing what you suggest by hand right now.
Automating it would go a long way toward removing objections to the current
division, I think.
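
For concreteness, a minimal sketch of this by-hand arrangement (not the actual
PyLith code), assuming a constant, pre-assembled mass matrix M stored in the user
context and with a placeholder right-hand side; the Jacobian matrix J is assumed
to have been created with the same nonzero pattern as M (e.g. by MatDuplicate):

  #include <petscts.h>

  typedef struct {
    Mat M;  /* constant, pre-assembled mass matrix */
  } AppCtx;

  /* implicit part: F_impl(t,U,Udot) = M*Udot */
  static PetscErrorCode IFunction(TS ts,PetscReal t,Vec U,Vec Udot,Vec F,void *ctx)
  {
    AppCtx        *user = (AppCtx*)ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatMult(user->M,Udot,F);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

  /* dF_impl/dU + shift*dF_impl/dUdot = shift*M */
  static PetscErrorCode IJacobian(TS ts,PetscReal t,Vec U,Vec Udot,PetscReal shift,Mat A,Mat B,void *ctx)
  {
    AppCtx        *user = (AppCtx*)ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = MatCopy(user->M,B,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    ierr = MatScale(B,shift);CHKERRQ(ierr);
    if (A != B) {
      ierr = MatCopy(user->M,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
      ierr = MatScale(A,shift);CHKERRQ(ierr);
    }
    PetscFunctionReturn(0);
  }

  /* explicit part: G(t,U); the VecCopy is only a placeholder for the real F(U) */
  static PetscErrorCode RHSFunction(TS ts,PetscReal t,Vec U,Vec G,void *ctx)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = VecCopy(U,G);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

  /* registration: an IMEX integrator then advances M*Udot = F(U)
       ierr = TSSetType(ts,TSARKIMEX);CHKERRQ(ierr);
       ierr = TSSetIFunction(ts,NULL,IFunction,&user);CHKERRQ(ierr);
       ierr = TSSetIJacobian(ts,J,J,IJacobian,&user);CHKERRQ(ierr);
       ierr = TSSetRHSFunction(ts,NULL,RHSFunction,&user);CHKERRQ(ierr);  */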

  Matt


>
> > 6) There is some support for adding two new function callbacks that a)
> compute the mass matrix and b) compute the Jacobian part of the IJacobian.
> This appears particularly useful for implementing adjoints.
> >
> > 7) If we added the two new interfaces the internals of an already
> > overly complicated TS become even more convoluted and unmaintainable.
>
> We could split the shift into ashift and bshift (the latter currently
> elided since it is always 1.0); then users could opt to skip work when
> one of them is zero.
> That would be a modest generalization from what we have now and would
> enable any desired optimization.  Integrators that wish to take
> advantage of M not changing could interpret a flag saying that it was
> constant and then always call the IJacobian with ashift=0.  That would
> be unintrusive for existing users and still enable all optimizations.
> It's only one additional parameter and then optimized user code could
> look like
>
>   for (each element) {
> Ke[m][m] = {};
> if (ashift != 0) {
>   Ke += ashift * (mass stuff);
> }
> if (bshift != 0) {
>   Ke += bshift * (non-mass stuff);
> }
>   }
>
> Naive user code would elide the conditionals, but would still work with
> all integrators.
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-dev] Proposed changes to TS API

2018-05-12 Thread Smith, Barry F.


> On May 12, 2018, at 1:17 PM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
>>> On May 11, 2018, at 6:05 PM, Jed Brown  wrote:
>>> 
>>> "Smith, Barry F."  writes:
>>> 
  Here is MY summary of the discussion so far.
 
 1) the IFunction/IJacobian interface has its supporters. There is a valid 
 argument that for certain cases it can be more efficient than the proposed 
 alternative; but this seems largely a theoretical belief at this time, 
 since there are no comparisons between nontrivial (or even trivial) codes 
 that assemble the shifted mass matrix plus Jacobian directly vs codes that 
 assemble the two parts separately and combine them with MatAXPY.  As far 
 as I can see this potential performance is the ONLY benefit of keeping the 
 IFunction/IJacobian() interface? Please list additional benefits if there 
 are any?
>>> 
>>> PCShell
>>> 
 2) The IFunction/IJacobian approach makes it impossible for TS to access 
 the mass matrix alone. 
>>> 
>>> Without wasted work, yes.
>>> 
 3) But one can access the IJacobian matrix alone by passing a shift of zero
 
 4) TSComputeIJacobian() is an ugly piece of shit code that has a confusing 
 name since it can also incorporate the RHS Jacobian. 
>>> 
>>> If you get rid of that, then every implicit integrator will need to
>>> handle the RHSFunction/RHSJacobian itself.  It will be significantly
>>> more code to maintain.
>>> 
 4a) the TSComputeIJacobian() has issues for linear problems because it 
 overwrites the user provided Jacobian and has hacks to deal with it
>>> 
>>> Yes, however that choice reduces memory usage which is sometimes an
>>> important factor.
>>> 
 5) There is no standard way to solve M u = F() explicitly from the 
 IFunction() form (and it cannot even be expressed with the explicit RHS 
 approach); the user must manage the M solve themselves inside their 
 RHSFunction.
>>> 
>>> We could write this as an IMEX method with IFunction providing M u and
>>> RHSFunction providing F.  I think this could be specialized for constant
>>> M and provided automatically for any explicit method.
>>> 
 6) There is some support for adding two new function callbacks that a) 
 compute the mass matrix and b) compute the Jacobian part of the IJacobian. 
 This appears particularly useful for implementing adjoints.
 
 7) If we added the two new interfaces the internals of an already
 overly complicated TS become even more convoluted and unmaintainable.  
>>> 
>>> We could split the shift into ashift and bshift (the latter currently
>>> elided since it is always 1.0); then users could opt to skip work when
>>> one of them is zero.
>>> That would be a modest generalization from what we have now and would
>>> enable any desired optimization.  Integrators that wish to take
>>> advantage of M not changing could interpret a flag saying that it was
>>> constant and then always call the IJacobian with ashift=0.  That would
>>> be unintrusive for existing users and still enable all optimizations.
>>> It's only one additional parameter and then optimized user code could
>>> look like
>>> 
>>> for (each element) {
>>>   Ke[m][m] = {};
>>>   if (ashift != 0) {
>>> Ke += ashift * (mass stuff);
>>>   }
>>>   if (bshift != 0) {
>>> Ke += bshift * (non-mass stuff);
>>>   }
>>> }
>>> 
>>> Naive user code would elide the conditionals, but would still work with
>>> all integrators.
>> 
>>   So this is a backdoor way of allowing the user to provide separate 
>> functions for computing the mass matrix and the Jacobian but the current 
>> TSSetIJacobian() function only takes one set of matrices in which means that 
>> if it returns just the mass matrix it returns it with the storage pattern of 
>> the Jacobian, so if the mass matrix is diagonal you get back a sparse matrix 
>> with many explicit nonzeros and just the entries on the diagonal. 
> 
> Okay, that's messy to work around and relevant to methods like compact
> FD where the mass matrix is significantly sparser than the RHS operator.
> 
>>   How about the following based on suggestions from several people.
>> 
>> Keep the current public API TSSetIJacobian() as is
>> 
>>  Add two new public functions: TSSetMass(Mat, functionmass),
>> TSSetIJacobianWithOutMass(Mat, functionwithoutmass) (bad name).
> 
> What if, when Mass is set, the existing IJacobian is called with shift=0
> by any consumers that desire the parts separately?  Then we wouldn't
> have to name TSSetIJacobianWithoutMass.  

   
> There could be a "don't ever
> call me with nonzero shift" option if the IJacobian implementation was
> incapable of assembling that part and didn't want to call
> 
>  TSGetMassMatrix(ts,&mass);   // honors a reuse policy
>  MatAXPY(J,shift,mass,user_knows_best_what_flag_to_use);

 Hmm, cutting back the API by one SetFunction but then needing another API to 
explain corner 

Re: [petsc-dev] Proposed changes to TS API

2018-05-12 Thread Jed Brown
"Smith, Barry F."  writes:

>> On May 11, 2018, at 6:05 PM, Jed Brown  wrote:
>> 
>> "Smith, Barry F."  writes:
>> 
>>>   Here is MY summary of the discussion so far.
>>> 
>>> 1) the IFunction/IJacobian interface has its supporters. There is a valid 
>>> argument that for certain cases it can be more efficient than the proposed 
>>> alternative; but this seems largely a theoretical belief at this time, 
>>> since there are no comparisons between nontrivial (or even trivial) codes 
>>> that assemble the shifted mass matrix plus Jacobian directly vs codes that 
>>> assemble the two parts separately and combine them with MatAXPY.  As far 
>>> as I can see this potential performance is the ONLY benefit of keeping the 
>>> IFunction/IJacobian() interface? Please list additional benefits if there 
>>> are any?
>> 
>> PCShell
>> 
>>> 2) The IFunction/IJacobian approach makes it impossible for TS to access 
>>> the mass matrix alone. 
>> 
>> Without wasted work, yes.
>> 
>>> 3) But one can access the IJacobian matrix alone by passing a shift of zero
>>> 
>>> 4) TSComputeIJacobian() is an ugly piece of shit code that has a confusing 
>>> name since it can also incorporate the RHS Jacobian. 
>> 
>> If you get rid of that, then every implicit integrator will need to
>> handle the RHSFunction/RHSJacobian itself.  It will be significantly
>> more code to maintain.
>> 
>>> 4a) the TSComputeIJacobian() has issues for linear problems because it 
>>> overwrites the user provided Jacobian and has hacks to deal with it
>> 
>> Yes, however that choice reduces memory usage which is sometimes an
>> important factor.
>> 
>>> 5) There is no standard way to solve M u = F() explicitly from the 
>>> IFunction() form (and it cannot even be expressed with the explicit RHS 
>>> approach); the user must manage the M solve themselves inside their 
>>> RHSFunction.
>> 
>> We could write this as an IMEX method with IFunction providing M u and
>> RHSFunction providing F.  I think this could be specialized for constant
>> M and provided automatically for any explicit method.
>> 
>>> 6) There is some support for adding two new function callbacks that a) 
>>> compute the mass matrix and b) compute the Jacobian part of the IJacobian. 
>>> This appears particularly useful for implementing adjoints.
>>> 
>>> 7) If we added the two new interfaces the internals of an already
>>> overly complicated TS become even more convoluted and unmaintainable.  
>> 
>> We could split the shift into ashift and bshift (the latter currently
>> elided since it is always 1.0); then users could opt to skip work when
>> one of them is zero.
>> That would be a modest generalization from what we have now and would
>> enable any desired optimization.  Integrators that wish to take
>> advantage of M not changing could interpret a flag saying that it was
>> constant and then always call the IJacobian with ashift=0.  That would
>> be unintrusive for existing users and still enable all optimizations.
>> It's only one additional parameter and then optimized user code could
>> look like
>> 
>>  for (each element) {
>>Ke[m][m] = {};
>>if (ashift != 0) {
>>  Ke += ashift * (mass stuff);
>>}
>>if (bshift != 0) {
>>  Ke += bshift * (non-mass stuff);
>>}
>>  }
>> 
>> Naive user code would elide the conditionals, but would still work with
>> all integrators.
>
>So this is a backdoor way of allowing the user to provide separate 
> functions for computing the mass matrix and the Jacobian but the current 
> TSSetIJacobian() function only takes one set of matrices in which means that 
> if it returns just the mass matrix it returns it with the storage pattern of 
> the Jacobian, so if the mass matrix is diagonal you get back a sparse matrix 
> with many explicit nonzeros and just the entries on the diagonal. 

Okay, that's messy to work around and relevant to methods like compact
FD where the mass matrix is significantly sparser than the RHS operator.

>How about the following based on suggestions from several people.
>
>  Keep the current public API TSSetIJacobian() as is
>
>   Add two new public functions: TSSetMass(Mat, functionmass),
> TSSetIJacobianWithOutMass(Mat, functionwithoutmass) (bad name).

What if, when Mass is set, the existing IJacobian is called with shift=0
by any consumers that desire the parts separately?  Then we wouldn't
have to name TSSetIJacobianWithoutMass.  There could be a "don't ever
call me with nonzero shift" option if the IJacobian implementation was
incapable of assembling that part and didn't want to call

  TSGetMassMatrix(ts,&mass);   // honors a reuse policy
  MatAXPY(J,shift,mass,user_knows_best_what_flag_to_use);

>   Provide developer functions: TSComputeMass(), which uses functionmass when
> possible, otherwise ijacobian; TSComputeIJacobianWithoutMass(), which computes
> using functionwithoutmass if possible, otherwise ijacobian. And change
> TSComputeIJacobian() so it uses ijacobian if it exists 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 6:05 PM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
>>   Here is MY summary of the discussion so far.
>> 
>> 1) the IFunction/IJacobian interface has its supporters. There is a valid 
>> argument that for certain cases it can be more efficient than the proposed 
>> alternative; but this seems largely a theoretical belief at this time, since 
>> there are no comparisons between nontrivial (or even trivial) codes that 
>> assemble the shifted mass matrix plus Jacobian directly vs codes that 
>> assemble the two parts separately and combine them with MatAXPY.  As far as 
>> I can see this potential performance is the ONLY benefit of keeping the 
>> IFunction/IJacobian() interface? Please list additional benefits if there 
>> are any?
> 
> PCShell
> 
>> 2) The IFunction/IJacobian approach makes it impossible for TS to access the 
>> mass matrix alone. 
> 
> Without wasted work, yes.
> 
>> 3) But one can access the IJacobian matrix alone by passing a shift of zero
>> 
>> 4) TSComputeIJacobian() is an ugly piece of shit code that has a confusing 
>> name since it can also incorporate the RHS Jacobian. 
> 
> If you get rid of that, then every implicit integrator will need to
> handle the RHSFunction/RHSJacobian itself.  It will be significantly
> more code to maintain.
> 
>> 4a) the TSComputeIJacobian() has issues for linear problems because it 
>> overwrites the user provided Jacobian and has hacks to deal with it
> 
> Yes, however that choice reduces memory usage which is sometimes an
> important factor.
> 
>> 5) There is no standard way to solve M u = F() explicitly from the 
>> IFunction() form (and it cannot even be expressed with the explicit RHS 
>> approach); the user must manage the M solve themselves inside their 
>> RHSFunction.
> 
> We could write this as an IMEX method with IFunction providing M u and
> RHSFunction providing F.  I think this could be specialized for constant
> M and provided automatically for any explicit method.
> 
>> 6) There is some support for adding two new function callbacks that a) 
>> compute the mass matrix and b) compute the Jacobian part of the IJacobian. 
>> This appears particularly useful for implementing adjoints.
>> 
>> 7) If we added the two new interfaces the internals of an already
>> overly complicated TS become even more convoluted and unmaintainable.  
> 
> We could split the shift into ashift and bshift (the latter currently
> elided since it is always 1.0); then users could opt to skip work when
> one of them is zero.
> That would be a modest generalization from what we have now and would
> enable any desired optimization.  Integrators that wish to take
> advantage of M not changing could interpret a flag saying that it was
> constant and then always call the IJacobian with ashift=0.  That would
> be unintrusive for existing users and still enable all optimizations.
> It's only one additional parameter and then optimized user code could
> look like
> 
>  for (each element) {
>Ke[m][m] = {};
>if (ashift != 0) {
>  Ke += ashift * (mass stuff);
>}
>if (bshift != 0) {
>  Ke += bshift * (non-mass stuff);
>}
>  }
> 
> Naive user code would elide the conditionals, but would still work with
> all integrators.

   So this is a backdoor way of allowing the user to provide separate functions 
for computing the mass matrix and the Jacobian but the current TSSetIJacobian() 
function only takes one set of matrices in which means that if it returns just 
the mass matrix it returns it with the storage pattern of the Jacobian, so if 
the mass matrix is diagonal you get back a sparse matrix with many explicit 
nonzeros and just the entries on the diagonal. 

   How about the following based on suggestions from several people.

 Keep the current public API TSSetIJacobian() as is

  Add two new public functions: TSSetMass(Mat, functionmass),
TSSetIJacobianWithOutMass(Mat, functionwithoutmass) (bad name).

  Provide developer functions: TSComputeMass(), which uses functionmass when
possible, otherwise ijacobian; TSComputeIJacobianWithoutMass(), which computes
using functionwithoutmass if possible, otherwise ijacobian. And change
TSComputeIJacobian() so it uses ijacobian if it exists, otherwise uses
functionmass and functionwithoutmass and combines them automatically.

  Drawbacks: more complicated internal code (though with the helper functions
it will be cleaner, especially in the adjoints). Benefits: power users can
provide the combined operation, while non-power users have a simpler interface.
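
A rough sketch of what the developer-level combination inside TSComputeIJacobian()
might look like if the split callbacks were added; TSComputeMass() and
TSComputeIJacobianWithoutMass() are only the proposed routines named above, their
signatures here are guesses, and none of this exists in PETSc:

  #include <petscts.h>

  /* proposed routines from this thread; the signatures are guesses */
  extern PetscErrorCode TSComputeMass(TS,PetscReal,Vec,Mat*);
  extern PetscErrorCode TSComputeIJacobianWithoutMass(TS,PetscReal,Vec,Mat*);

  static PetscErrorCode TSComputeIJacobian_Split(TS ts,PetscReal t,Vec U,PetscReal shift,Mat A,Mat B)
  {
    Mat            M,J;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = TSComputeMass(ts,t,U,&M);CHKERRQ(ierr);                  /* functionmass if set, else from ijacobian with shift=1, no Jacobian part */
    ierr = TSComputeIJacobianWithoutMass(ts,t,U,&J);CHKERRQ(ierr);  /* functionwithoutmass if set, else ijacobian with shift=0 */
    ierr = MatCopy(J,B,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);    /* B = J */
    ierr = MatAXPY(B,shift,M,SUBSET_NONZERO_PATTERN);CHKERRQ(ierr); /* B = shift*M + J; assumes M's pattern is a subset of J's */
    if (A != B) {ierr = MatCopy(B,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);}
    PetscFunctionReturn(0);
  }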







Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>> On May 11, 2018, at 6:05 PM, Jed Brown  wrote:
>> 
>> "Smith, Barry F."  writes:
>> 
>>>   Here is MY summary of the discussion so far.
>>> 
>>> 1) the IFunction/IJacobian interface has its supporters. There is a valid 
>>> argument that for certain cases it can be more efficient than the proposed 
>>> alternative; but this seems largely a theoretical belief at this time, 
>>> since there are no comparisons between nontrivial (or even trivial) codes 
>>> that assemble the shifted mass matrix plus Jacobian directly vs codes that 
>>> assemble the two parts separately and combine them with MatAXPY.  As far 
>>> as I can see this potential performance is the ONLY benefit of keeping the 
>>> IFunction/IJacobian() interface? Please list additional benefits if there 
>>> are any?
>> 
>> PCShell
>
>Please explain what this means, I have no clue.

If you are using matrix-free preconditioning, SNESGS, and anything else
that really looks into the problem structure, then the
IFunction/IJacobian interface is much cleaner than having separate
functions for mass and non-mass parts and needing the solver to learn
about it by interrogating a different data structure like MatShell.

>>> 2) The IFunction/IJacobian approach makes it impossible for TS to access 
>>> the mass matrix alone. 
>> 
>> Without wasted work, yes.
>
>   The computation of two Jacobians correct?

Yeah, though it could be cached if that was a performance bottleneck.

>>> 3) But one can access the IJacobian matrix alone by passing a shift of zero
>>> 
>>> 4) TSComputeIJacobian() is an ugly piece of shit code that has a confusing 
>>> name since it can also incorporate the RHS Jacobian. 
>> 
>> If you get rid of that, then every implicit integrator will need to
>> handle the RHSFunction/RHSJacobian itself.  It will be significantly
>> more code to maintain.
>
>I don't propose "getting rid of it", I suggest perhaps the code could be 
> refactored (I don't know how) to have something more streamlined.

Okay.

>>> 4a) the TSComputeIJacobian() has issues for linear problems because it 
>>> overwrites the user provided Jacobian and has hacks to deal with it
>> 
>> Yes, however that choice reduces memory usage which is sometimes an
>> important factor.
>> 
>>> 5) There is no standard way to solve M u = F() explicitly from the 
>>> IFunction() form (and it cannot even be expressed with the explicit RHS 
>>> approach); the user must manage the M solve themselves inside their 
>>> RHSFunction.
>> 
>> We could write this as an IMEX method with IFunction providing M u and
>> RHSFunction providing F.  I think this could be specialized for constant
>> M and provided automatically for any explicit method.
>> 
>>> 6) There is some support for adding two new function callbacks that a) 
>>> compute the mass matrix and b) compute the Jacobian part of the IJacobian. 
>>> This appears particularly useful for implementing adjoints.
>>> 
>>> 7) If we added the two new interfaces the internals of an already
>>> overly complicated TS become even more convoluted and unmaintainable.  
>> 
>> We could split the shift into ashift and bshift (the latter currently
>> elided since it is always 1.0); then users could opt to skip work when
>> one of them is zero.
>> That would be a modest generalization from what we have now and would
>> enable any desired optimization.  Integrators that wish to take
>> advantage of M not changing could interpret a flag saying that it was
>> constant and then always call the IJacobian with ashift=0.  That would
>> be unintrusive for existing users and still enable all optimizations.
>> It's only one additional parameter and then optimized user code could
>> look like
>> 
>>  for (each element) {
>>Ke[m][m] = {};
>>if (ashift != 0) {
>>  Ke += ashift * (mass stuff);
>>}
>>if (bshift != 0) {
>>  Ke += bshift * (non-mass stuff);
>>}
>>  }
>> 
>> Naive user code would elide the conditionals, but would still work with
>> all integrators.
>
>Then also provide new TS developer routines such as
>TSComputeMass(), TSComputeJacobianWithoutMass() (bad name) so that
>TS code would look cleaner and not have a bunch of ugly calls to
>TSComputeIJacobian with shifts of 1 and 0 around that I suspect
>Hong doesn't like.

Either way; I don't care.  Some developers use VecSet(X,0) instead of
VecZeroEntries and there are many calls to VecSet(X,1.0) in PETSc but no
dedicated interface.


Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 6:05 PM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
>>   Here is MY summary of the discussion so far.
>> 
>> 1) the IFunction/IJacobian interface has its supporters. There is a valid 
>> argument that for certain cases it can be more efficient than the proposed 
>> alternative; but this seems largely a theoretical belief at this time, since 
>> there are no comparisons between nontrivial (or even trivial) codes that 
>> assemble the shifted mass matrix plus Jacobian directly vs codes that 
>> assemble the two parts separately and combine them with MatAXPY.  As far as 
>> I can see this potential performance is the ONLY benefit of keeping the 
>> IFunction/IJacobian() interface? Please list additional benefits if there 
>> are any?
> 
> PCShell

   Please explain what this means, I have no clue.

> 
>> 2) The IFunction/IJacobian approach makes it impossible for TS to access the 
>> mass matrix alone. 
> 
> Without wasted work, yes.

  The computation of two Jacobians correct?
> 
>> 3) But one can access the IJacobian matrix alone by passing a shift of zero
>> 
>> 4) TSComputeIJacobian() is an ugly piece of shit code that has a confusing 
>> name since it can also incorporate the RHS Jacobian. 
> 
> If you get rid of that, then every implicit integrator will need to
> handle the RHSFunction/RHSJacobian itself.  It will be significantly
> more code to maintain.

   I don't propose "getting rid of it", I suggest perhaps the code could be 
refactored (I don't know how) to have something more streamlined.

> 
>> 4a) the TSComputeIJacobian() has issues for linear problems because it 
>> overwrites the user provided Jacobian and has hacks to deal with it
> 
> Yes, however that choice reduces memory usage which is sometimes an
> important factor.
> 
>> 5) There is no standard way to solve M u = F() explicitly from the 
>> IFunction() form (and it cannot even be expressed with the explicit RHS 
>> approach); the user must manage the M solve themselves inside their 
>> RHSFunction.
> 
> We could write this as an IMEX method with IFunction providing M u and
> RHSFunction providing F.  I think this could be specialized for constant
> M and provided automatically for any explicit method.
> 
>> 6) There is some support for adding two new function callbacks that a) 
>> compute the mass matrix and b) compute the Jacobian part of the IJacobian. 
>> This appears particularly useful for implementing adjoints.
>> 
>> 7) If we added the two new interfaces the internals of an already
>> overly complicated TS become even more convoluted and unmaintainable.  
> 
> We could split the shift into ashift and bshift (the latter currently
> elided since it is always 1.0); then users could opt to skip work when
> one of them is zero.
> That would be a modest generalization from what we have now and would
> enable any desired optimization.  Integrators that wish to take
> advantage of M not changing could interpret a flag saying that it was
> constant and then always call the IJacobian with ashift=0.  That would
> be unintrusive for existing users and still enable all optimizations.
> It's only one additional parameter and then optimized user code could
> look like
> 
>  for (each element) {
>Ke[m][m] = {};
>if (ashift != 0) {
>  Ke += ashift * (mass stuff);
>}
>if (bshift != 0) {
>  Ke += bshift * (non-mass stuff);
>}
>  }
> 
> Naive user code would elide the conditionals, but would still work with
> all integrators.

   Then also provide new TS developer routines such as TSComputeMass(), 
TSComputeJacobianWithoutMass() (bad name) so that TS code would look cleaner 
and not have a bunch of ugly calls to TSComputeIJacobian with shifts of 1 and 0 
around that I suspect Hong doesn't like.






Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>Here is MY summary of the discussion so far.
>
> 1) the IFunction/IJacobian interface has its supporters. There is a valid 
> argument that for certain cases it can be more efficient than the proposed 
> alternative; but this seems largely a theoretical belief at this time, since 
> there are no comparisons between nontrivial (or even trivial) codes that 
> assemble the shifted mass matrix plus Jacobian directly vs codes that assemble 
> the two parts separately and combine them with MatAXPY.  As far as I can see 
> this potential performance is the ONLY benefit of keeping the 
> IFunction/IJacobian() interface? Please list additional benefits if there are any?

PCShell

> 2) The IFunction/IJacobian approach makes it impossible for TS to access the 
> mass matrix alone. 

Without wasted work, yes.

> 3) But one can access the IJacobian matrix alone by passing a shift of zero
>
> 4) TSComputeIJacobian() is an ugly piece of shit code that has a confusing 
> name since it can also incorporate the RHS Jacobian. 

If you get rid of that, then every implicit integrator will need to
handle the RHSFunction/RHSJacobian itself.  It will be significantly
more code to maintain.

> 4a) the TSComputeIJacobian() has issues for linear problems because it 
> overwrites the user provided Jacobian and has hacks to deal with it

Yes, however that choice reduces memory usage which is sometimes an
important factor.

> 5) There is no standard way to solve M u = F() explicitly from the 
> IFunction() form (and it cannot even be expressed with the explicit RHS 
> approach); the user must manage the M solve themselves inside their 
> RHSFunction.

We could write this as an IMEX method with IFunction providing M u and
RHSFunction providing F.  I think this could be specialized for constant
M and provided automatically for any explicit method.

> 6) There is some support for adding two new function callbacks that a) 
> compute the mass matrix and b) compute the Jacobian part of the IJacobian. 
> This appears particularly useful for implementing adjoints.
>
> 7) If we added the two new interfaces the internals of an already
> overly complicated TS become even more convoluted and unmaintainable.  

We could split the shift into ashift and bshift (the latter currently
elided since it is always 1.0); then users could opt to skip work when
one of them is zero.
That would be a modest generalization from what we have now and would
enable any desired optimization.  Integrators that wish to take
advantage of M not changing could interpret a flag saying that it was
constant and then always call the IJacobian with ashift=0.  That would
be unintrusive for existing users and still enable all optimizations.
It's only one additional parameter and then optimized user code could
look like

  for (each element) {
Ke[m][m] = {};
if (ashift != 0) {
  Ke += ashift * (mass stuff);
}
if (bshift != 0) {
  Ke += bshift * (non-mass stuff);
}
  }

Naive user code would elide the conditionals, but would still work with
all integrators.
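
To make the sketch above concrete, here is one way such a two-shift element loop
could look in C. Everything here is hypothetical: ashift/bshift are the proposed
(not existing) parameters, and the element routines, element size, and index
layout are placeholders standing in for whatever the application already has:

  #include <petscmat.h>

  #define NEN 8   /* nodes per element (placeholder) */

  /* hypothetical element routines provided by the application */
  extern void EvalMassElement(PetscInt e,PetscScalar Me[NEN][NEN]);
  extern void EvalStiffnessElement(PetscInt e,PetscScalar Je[NEN][NEN]);

  /* Assemble B = ashift*(mass part) + bshift*(non-mass part), skipping whichever
     part has a zero coefficient.  elem[] holds the global indices of each
     element's nodes, NEN per element. */
  static PetscErrorCode AssembleTwoShift(Mat B,PetscReal ashift,PetscReal bshift,PetscInt nelem,const PetscInt elem[])
  {
    PetscScalar    Ke[NEN][NEN],Me[NEN][NEN],Je[NEN][NEN];
    PetscInt       e,i,j;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    for (e = 0; e < nelem; e++) {
      ierr = PetscMemzero(Ke,sizeof(Ke));CHKERRQ(ierr);
      if (ashift != 0.0) {          /* mass contribution only when needed */
        EvalMassElement(e,Me);
        for (i = 0; i < NEN; i++) for (j = 0; j < NEN; j++) Ke[i][j] += ashift*Me[i][j];
      }
      if (bshift != 0.0) {          /* non-mass (stiffness/Jacobian) contribution */
        EvalStiffnessElement(e,Je);
        for (i = 0; i < NEN; i++) for (j = 0; j < NEN; j++) Ke[i][j] += bshift*Je[i][j];
      }
      ierr = MatSetValues(B,NEN,elem+e*NEN,NEN,elem+e*NEN,&Ke[0][0],ADD_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }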


Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>> On May 11, 2018, at 3:36 PM, Jed Brown  wrote:
>> 
>> "Zhang, Hong"  writes:
>> 
>>> We are not forcing users to do two matrix assemblies per time
>>> step. For most cases, there is even no need to update dF/dUdot at
>>> all. For extreme cases that the application requires frequent update
>>> on dF/dUdot and assembly is expensive, one can always assemble the
>>> single-matrix Jacobian and throw it to SNES directly.
>> 
>> For the vast majority of cases, the mass matrix terms are inexpensive
>> and can be summed during each assembly at negligible cost (cheaper than
>> accessing those terms from an already-assembled matrix).
>> 
>>> 
>>> TS used to be an unusable pile of crap until Jed proposed the marvelous
>>> IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
>>> SIMPLE to express, and a single IFunction/IJacobian pair reusable for
>>> different timestepper implementations a reality. And we got an added
>>> bonus: this was efficient, it involved a SINGLE global matrix assembly. If
>>> the issue is in supporting simpler problems, then we should go for an
>>> alternative interface with broken Jacobians, just as Stefano proposed in a
>>> previous email. I'm totally in favor of an additional broken Jacobians API,
>>> and totally against the removal of the single-matrix IJacobian interface.
>>> 
>>> The current IJacobian is essentially SNESJacobian. And the single-matrix 
>>> SNESJacobian interface is always there. Power users could set up the 
>>> SNESJacobian directly if we pass a read-only shift parameter to them. So we 
>>> are by no means prohibiting power users from doing what they want.
>> 
>> You mean the user would call TSGetSNES and SNESSetJacobian, then
>> internally call TSGetIJacobianShift and some new function to create
>> Udot?  That sounds way messier and error-prone.
>> 
>>> IJacobian with shift mixes TS Jacobian and SNES Jacobian. This is the issue 
>>> we need to fix.
>> 
>> It is just one interface producing exactly the matrix that the solver
>> needs.
>
> The implicit solver needs it, but the explicit solvers do not, nor do the 
> adjoint computations. This is the issue.

Hmm, but the time stepping uses the same matrix.  The mass matrix alone
is only used in the initial and final conditions for the adjoint system
(i.e., not performance sensitive).  What am I missing?



Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.

   Here is MY summary of the discussion so far.

1) the IFunction/IJacobian interface has its supporters. There is a valid 
argument that for certain cases it can be more efficient than the proposed 
alternative; but this seems largely a theoretical belief at this time, since 
there are no comparisons between nontrivial (or even trivial) codes that 
assemble the shifted mass matrix plus Jacobian directly vs codes that assemble 
the two parts separately and combine them with MatAXPY.  As far as I can see 
this potential performance is the ONLY benefit of keeping the IFunction/IJacobian() 
interface? Please list additional benefits if there are any?

2) The IFunction/IJacobian approach makes it impossible for TS to access the 
mass matrix alone. 

3) But one can access the IJacobian matrix alone by passing a shift of zero

4) TSComputeIJacobian() is an ugly piece of shit code that has a confusing name 
since it can also incorporate the RHS Jacobian. 

4a) the TSComputeIJacobian() has issues for linear problems because it 
overwrites the user provided Jacobian and has hacks to deal with it

5) There is no standard way to solve M u = F() explicitly from the IFunction() 
form (and it cannot even be expressed with the explicit RHS approach); the user 
must manage the M solve themselves inside their RHSFunction.

6) There is some support for adding two new function callbacks that a) compute 
the mass matrix and b) compute the Jacobian part of the IJacobian. This appears 
particularly useful for implementing adjoints.

7) If we added the two new interfaces the internals of an already overly 
complicated TS become even more convoluted and unmaintainable.  For kicks I 
list the current TSComputeIJacobian() which I consider unmaintainable already.

  if (ijacobian) {
    PetscBool missing;
    PetscStackPush("TS user implicit Jacobian");
    ierr = (*ijacobian)(ts,t,U,Udot,shift,A,B,ctx);CHKERRQ(ierr);
    PetscStackPop;
    ierr = MatMissingDiagonal(A,&missing,NULL);CHKERRQ(ierr);
    if (missing) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"Amat passed to TSSetIJacobian() must have all diagonal entries set, if they are zero you must still set them with a zero value");
    if (B != A) {
      ierr = MatMissingDiagonal(B,&missing,NULL);CHKERRQ(ierr);
      if (missing) SETERRQ(PETSC_COMM_SELF,PETSC_ERR_ARG_WRONGSTATE,"Bmat passed to TSSetIJacobian() must have all diagonal entries set, if they are zero you must still set them with a zero value");
    }
  }
  if (imex) {
    if (!ijacobian) {  /* system was written as Udot = G(t,U) */
      PetscBool assembled;
      ierr = MatZeroEntries(A);CHKERRQ(ierr);
      ierr = MatAssembled(A,&assembled);CHKERRQ(ierr);
      if (!assembled) {
        ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
        ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      }
      ierr = MatShift(A,shift);CHKERRQ(ierr);
      if (A != B) {
        ierr = MatZeroEntries(B);CHKERRQ(ierr);
        ierr = MatAssembled(B,&assembled);CHKERRQ(ierr);
        if (!assembled) {
          ierr = MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
          ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
        }
        ierr = MatShift(B,shift);CHKERRQ(ierr);
      }
    }
  } else {
    Mat Arhs = NULL,Brhs = NULL;
    if (rhsjacobian) {
      ierr = TSGetRHSMats_Private(ts,&Arhs,&Brhs);CHKERRQ(ierr);
      ierr = TSComputeRHSJacobian(ts,t,U,Arhs,Brhs);CHKERRQ(ierr);
    }
    if (Arhs == A) {          /* No IJacobian, so we only have the RHS matrix */
      PetscBool flg;
      ts->rhsjacobian.scale = -1;
      ts->rhsjacobian.shift = shift;
      ierr = SNESGetUseMatrixFree(ts->snes,NULL,&flg);CHKERRQ(ierr);
      /* since -snes_mf_operator uses the full SNES function it does not need to be shifted or scaled here */
      if (!flg) {
        ierr = MatScale(A,-1);CHKERRQ(ierr);
        ierr = MatShift(A,shift);CHKERRQ(ierr);
      }
      if (A != B) {
        ierr = MatScale(B,-1);CHKERRQ(ierr);
        ierr = MatShift(B,shift);CHKERRQ(ierr);
      }
    } else if (Arhs) {        /* Both IJacobian and RHSJacobian */
      MatStructure axpy = DIFFERENT_NONZERO_PATTERN;
      if (!ijacobian) {       /* No IJacobian provided, but we have a separate RHS matrix */
        ierr = MatZeroEntries(A);CHKERRQ(ierr);
        ierr = MatShift(A,shift);CHKERRQ(ierr);
        if (A != B) {
          ierr = MatZeroEntries(B);CHKERRQ(ierr);
          ierr = MatShift(B,shift);CHKERRQ(ierr);
        }
      }
      ierr = MatAXPY(A,-1,Arhs,axpy);CHKERRQ(ierr);
      if (A != B) {
        ierr = MatAXPY(B,-1,Brhs,axpy);CHKERRQ(ierr);
      }
    }
  }

  Please comment and continue discussion.


> On May 11, 2018, at 5:09 PM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
 The current IJacobian is essentially SNESJacobian. And the single-matrix 
 SNESJacobian interface is always there. Power users could set up the 
 SNESJacobian directly if we pass a read-only shift 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 5:26 PM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
>>> On May 11, 2018, at 5:09 PM, Jed Brown  wrote:
>>> 
>>> "Smith, Barry F."  writes:
>>> 
>> The current IJacobian is essentially SNESJacobian. And the single-matrix 
>> SNESJacobian interface is always there. Power users could set up the 
>> SNESJacobian directly if we pass a read-only shift parameter to them. So 
>> we are by no means prohibiting power users from doing what they want.
> 
> You mean the user would call TSGetSNES and SNESSetJacobian, then
> internally call TSGetIJacobianShift and some new function to create
> Udot?  That sounds way messier and error-prone.
> 
> And completely impossible to compose. We should be explicitly talking 
> about subsystems that use TS. For example,
> I have a nonlinear system for plasticity. I want to use a preconditioner 
> that introduces some elasticity and timesteps to
> steady state to provide a good Newton direction. I need to be able to 
> create the solver without all sorts of digging
> in and setting things. This idea that you "just set SNESJacobian" is a 
> non-starter.
 
  But this is exactly what TSComputeIJacobian currently does, so is the 
 current interface a non-starter?
>>> 
>>> It isn't at all the current interface.  
>> 
>>   You misunderstand what I am saying. The current interface produces exactly 
>> the Jacobian needed for the SNES solve (nothing more and nothing less) as a 
>> single monolithic matrix. Matt says this is a non-starter.
> 
> Matt says that having the user call TSGetSNES and SNESSetJacobian (my
> interpretation of Hong's "users could set up the SNESJacobian directly")
> to assemble that matrix is a non-starter.

   He said it in a very confusing way and brought in seemingly irrelevant 
discussion about some pseudo-timestepping inside the needed nonlinear solve.



Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>> On May 11, 2018, at 5:09 PM, Jed Brown  wrote:
>> 
>> "Smith, Barry F."  writes:
>> 
> The current IJacobian is essentially SNESJacobian. And the single-matrix 
> SNESJacobian interface is always there. Power users could set up the 
> SNESJacobian directly if we pass a read-only shift parameter to them. So 
> we are by no means prohibiting power users from doing what they want.
 
 You mean the user would call TSGetSNES and SNESSetJacobian, then
 internally call TSGetIJacobianShift and some new function to create
 Udot?  That sounds way messier and error-prone.
 
 And completely impossible to compose. We should be explicitly talking 
 about subsystems that use TS. For example,
 I have a nonlinear system for plasticity. I want to use a preconditioner 
 that introduces some elasticity and timesteps to
 steady state to provide a good Newton direction. I need to be able to 
 create the solver without all sorts of digging
 in and setting things. This idea that you "just set SNESJacobian" is a 
 non-starter.
>>> 
>>>   But this is exactly what TSComputeIJacobian currently does, so is the 
>>> current interface a non-starter?
>> 
>> It isn't at all the current interface.  
>
>You misunderstand what I am saying. The current interface produces exactly 
> the Jacobian needed for the SNES solve (nothing more and nothing less) as a 
> single monolithic matrix. Matt says this is a non-starter.

Matt says that having the user call TSGetSNES and SNESSetJacobian (my
interpretation of Hong's "users could set up the SNESJacobian directly")
to assemble that matrix is a non-starter.


Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 5:09 PM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
 The current IJacobian is essentially SNESJacobian. And the single-matrix 
 SNESJacobian interface is always there. Power users could set up the 
 SNESJacobian directly if we pass a read-only shift parameter to them. So 
 we are by no means prohibiting power users from doing what they want.
>>> 
>>> You mean the user would call TSGetSNES and SNESSetJacobian, then
>>> internally call TSGetIJacobianShift and some new function to create
>>> Udot?  That sounds way messier and error-prone.
>>> 
>>> And completely impossible to compose. We should be explicitly talking about 
>>> subsystems that use TS. For example,
>>> I have a nonlinear system for plasticity. I want to use a preconditioner 
>>> that introduces some elasticity and timesteps to
>>> steady state to provide a good Newton direction. I need to be able to 
>>> create the solver without all sorts of digging
>>> in and setting things. This idea that you "just set SNESJacobian" is a 
>>> non-starter.
>> 
>>   But this is exactly what TSComputeIJacobian currently does, so is the 
>> current interface a non-starter?
> 
> It isn't at all the current interface.  

   You misunderstand what I am saying. The current interface produces exactly 
the Jacobian needed for the SNES solve (nothing more and nothing less) as a 
single monolithic matrix. Matt says this is a non-starter.

   Barry

> If the user is calling
> SNESSetJacobian, then we would need to open up the bowels of every
> SNESTSFormJacobian_* and make stable public APIs out of those internals
> (including the wiring for nonlinear multigrid).  This sounds like a
> terrible thing to make public.



Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>> > The current IJacobian is essentially SNESJacobian. And the single-matrix 
>> > SNESJacobian interface is always there. Power users could set up the 
>> > SNESJacobian directly if we pass a read-only shift parameter to them. So 
>> > we are by no means prohibiting power users from doing what they want.
>> 
>> You mean the user would call TSGetSNES and SNESSetJacobian, then
>> internally call TSGetIJacobianShift and some new function to create
>> Udot?  That sounds way messier and error-prone.
>> 
>> And completely impossible to compose. We should be explicitly talking about 
>> subsystems that use TS. For example,
>> I have a nonlinear system for plasticity. I want to use a preconditioner 
>> that introduces some elasticity and timesteps to
>> steady state to provide a good Newton direction. I need to be able to 
>> create the solver without all sorts of digging
>> in and setting things. This idea that you "just set SNESJacobian" is a 
>> non-starter.
>
>But this is exactly what TSComputeIJacobian currently does, so is the 
> current interface a non-starter?

It isn't at all the current interface.  If the user is calling
SNESSetJacobian, then we would need to open up the bowels of every
SNESTSFormJacobian_* and make stable public APIs out of those internals
(including the wiring for nonlinear multigrid).  This sounds like a
terrible thing to make public.


Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 3:38 PM, Matthew Knepley  wrote:
> 
> On Fri, May 11, 2018 at 4:36 PM, Jed Brown  wrote:
> "Zhang, Hong"  writes:
> 
> > We are not forcing users to do two matrix assemblies per time
> > step. For most cases, there is even no need to update dF/dUdot at
> > all. For extreme cases that the application requires frequent update
> > on dF/dUdot and assembly is expensive, one can always assemble the
> > single-matrix Jacobian and throw it to SNES directly.
> 
> For the vast majority of cases, the mass matrix terms are inexpensive
> and can be summed during each assembly at negligible cost (cheaper than
> accessing those terms from an already-assembled matrix).
> 
> >
> > TS used to be an unusable pile of crap until Jed proposed the marvelous
> > IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
> > SIMPLE to express, and a single IFunction/IJacobian pair reusable for
> > different timestepper implementations a reality. And we got an added
> > bonus: this was efficient, it involved a SINGLE global matrix assembly. If
> > the issue is in supporting simpler problems, then we should go for an
> > alternative interface with broken Jacobians, just as Stefano proposed in a
> > previous email. I'm totally in favor of an additional broken Jacobians API,
> > and totally against the removal of the single-matrix IJacobian interface.
> >
> > The current IJacobian is essentially SNESJacobian. And the single-matrix 
> > SNESJacobian interface is always there. Power users could set up the 
> > SNESJacobian directly if we pass a read-only shift parameter to them. So we 
> > are by no means prohibiting power users from doing what they want.
> 
> You mean the user would call TSGetSNES and SNESSetJacobian, then
> internally call TSGetIJacobianShift and some new function to create
> Udot?  That sounds way messier and error-prone.
> 
> And completely impossible to compose. We should be explicitly talking about 
> subsystems that use TS. For example,
> I have a nonlinear system for plasticity. I want to use a preconditioner that 
> introduces some elasticity and timesteps to
> steady state to provide a good Newton direction. I need to be able to 
> create the solver without all sorts of digging
> in and setting things. This idea that you "just set SNESJacobian" is a 
> non-starter.

   But this is exactly what TSComputeIJacobian currently does, so is the 
current interface a non-starter?

> 
>   Matt
>  
> 
> > IJacobian with shift mixes TS Jacobian and SNES Jacobian. This is the issue 
> > we need to fix.
> 
> It is just one interface producing exactly the matrix that the solver
> needs.
> 
> 
> 
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
> 
> https://www.cse.buffalo.edu/~knepley/



Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 3:36 PM, Jed Brown  wrote:
> 
> "Zhang, Hong"  writes:
> 
>> We are not forcing users to do two matrix assemblies per time
>> step. For most cases, there is even no need to update dF/dUdot at
>> all. For extreme cases that the application requires frequent update
>> on dF/dUdot and assembly is expensive, one can always assemble the
>> single-matrix Jacobian and throw it to SNES directly.
> 
> For the vast majority of cases, the mass matrix terms are inexpensive
> and can be summed during each assembly at negligible cost (cheaper than
> accessing those terms from an already-assembled matrix).
> 
>> 
>> TS used to be an unusable pile of crap until Jed proposed the marvelous
>> IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
>> SIMPLE to express, and a single IFunction/IJacobian pair reusable for
>> different timestepper implementations a reality. And we got an added
>> bonus: this was efficient, it involved a SINGLE global matrix assembly. If
>> the issue is in supporting simpler problems, then we should go for an
>> alternative interface with broken Jacobians, just as Stefano proposed in a
>> previous email. I'm totally in favor of an additional broken Jacobians API,
>> and totally against the removal of the single-matrix IJacobian interface.
>> 
>> The current IJacobian is essentially SNESJacobian. And the single-matrix 
>> SNESJacobian interface is always there. Power users could set up the 
>> SNESJacobian directly if we pass a read-only shift parameter to them. So we 
>> are by no means prohibiting power users from doing what they want.
> 
> You mean the user would call TSGetSNES and SNESSetJacobian, then
> internally call TSGetIJacobianShift and some new function to create
> Udot?  That sounds way messier and error-prone.
> 
>> IJacobian with shift mixes TS Jacobian and SNES Jacobian. This is the issue 
>> we need to fix.
> 
> It is just one interface producing exactly the matrix that the solver
> needs.

The implicit solver needs it, but the explicit solvers do not, nor do the 
adjoint computations. This is the issue.





Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Matthew Knepley
On Fri, May 11, 2018 at 4:36 PM, Jed Brown  wrote:

> "Zhang, Hong"  writes:
>
> > We are not forcing users to do two matrix assemblies per time
> > step. For most cases, there is even no need to update dF/dUdot at
> > all. For extreme cases that the application requires frequent update
> > on dF/dUdot and assembly is expensive, one can always assemble the
> > single-matrix Jacobian and throw it to SNES directly.
>
> For the vast majority of cases, the mass matrix terms are inexpensive
> and can be summed during each assembly at negligible cost (cheaper than
> accessing those terms from an already-assembled matrix).
>
> >
> > TS used to be an unusable pile of crap until Jed proposed the marvelous
> > IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
> > SIMPLE to express, and a single IFunction/IJacobian pair reusable for
> > different timestepper implementations a reality. And we got an added
> > bonus: this was efficient, it involved a SINGLE global matrix assembly. If
> > the issue is in supporting simpler problems, then we should go for an
> > alternative interface with broken Jacobians, just as Stefano proposed in a
> > previous email. I'm totally in favor of an additional broken Jacobians API,
> > and totally against the removal of the single-matrix IJacobian interface.
> >
> > The current IJacobian is essentially SNESJacobian. And the single-matrix
> SNESJacobian interface is always there. Power users could set up the
> SNESJacobian directly if we pass a read-only shift parameter to them. So we
> are by no means prohibiting power users from doing what they want.
>
> You mean the user would call TSGetSNES and SNESSetJacobian, then
> internally call TSGetIJacobianShift and some new function to create
> Udot?  That sounds way messier and error-prone.


And completely impossible to compose. We should be explicitly talking about
subsystems that use TS. For example,
I have a nonlinear system for plasticity. I want to use a preconditioner
that introduces some elasticity and timesteps to
steady state to provide a good Newton direction. I need to be able to
create the solver without all sorts of digging
in and setting things. This idea that you "just set SNESJacobian" is a
non-starter.

  Matt


>
> > IJacobian with shift mixes TS Jacobian and SNES Jacobian. This is the
> issue we need to fix.
>
> It is just one interface producing exactly the matrix that the solver
> needs.
>



-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Zhang, Hong"  writes:

> We are not forcing users to do two matrix assemblies per time
> step. For most cases, there is even no need to update dF/dUdot at
> all. For extreme cases that the application requires frequent update
> on dF/dUdot and assembly is expensive, one can always assemble the
> single-matrix Jacobian and throw it to SNES directly.

For the vast majority of cases, the mass matrix terms are inexpensive
and can be summed during each assembly at negligible cost (cheaper than
accessing those terms from an already-assembled matrix).

>
> TS used to be an unusable pile of crap until Jed proposed the marvelous
> IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
> SIMPLE to express, and a single IFunction/IJacobian pair reusable for
> different timestepper implementations a reality. And we got an added
> bonus: this was efficient, it involved a SINGLE global matrix assembly. If
> the issue is in supporting simpler problems, then we should go for an
> alternative interface with broken Jacobians, just as Stefano proposed in a
> previous email. I'm totally in favor of an additional broken Jacobians API,
> and totally against the removal of the single-matrix IJacobian interface.
>
> The current IJacobian is essentially SNESJacobian. And the single-matrix 
> SNESJacobian interface is always there. Power users could set up the 
> SNESJacobian directly if we pass a read-only shift parameter to them. So we 
> are by no means prohibiting power users from doing what they want.

You mean the user would call TSGetSNES and SNESSetJacobian, then
internally call TSGetIJacobianShift and some new function to create
Udot?  That sounds way messier and error-prone.

> IJacobian with shift mixes TS Jacobian and SNES Jacobian. This is the issue 
> we need to fix.

It is just one interface producing exactly the matrix that the solver
needs.
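
To spell out what that matrix is (this just restates the documented
TSSetIJacobian convention): with the implicit form F(t,u,\dot u) = 0 and the
integrator supplying the shift \sigma = \partial\dot u/\partial u at the
current stage, the Newton matrix is

  J_{\mathrm{SNES}} \;=\; \sigma\,\frac{\partial F}{\partial \dot u} \;+\; \frac{\partial F}{\partial u},
  \qquad \text{so for } F = M\dot u - G(t,u):\quad J_{\mathrm{SNES}} = \sigma M - \frac{\partial G}{\partial u}.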


Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Emil Constantinescu



On 5/11/18 3:14 PM, Zhang, Hong wrote:

On May 11, 2018, at 1:01 PM, Lisandro Dalcin wrote:

On Fri, 11 May 2018 at 19:34, Jed Brown wrote:

"Smith, Barry F." writes:

I assemble the combined system directly.

 How, two sets of calls to MatSetValues(), One for the scaled mass
matrix and one for the Jacobian entries? For a constant mass matrix does
this mean you are recomputing the mass matrix entries each call? Or are you
storing the mass matrix entries somewhere?  Or is your mass matrix diagonal
only?

 Or do you build element by element the M*shift + J element stiffness
and then insert it with a single MatSetValues() call?

It isn't even built separately at the element scale, just summed
contributions at quadrature points.

That's exactly the way I do it in PetIGA, and the way I would do it in any
other general-purpose FEM-like code. In high-order FEM, Jacobian assembly
may very well account for 50% to 80% of runtime (at least that's my
experience with PetIGA). IMHO, forcing users to have to do TWO global
matrix assemblies per time step is simply unacceptable, both in terms of
memory and runtime.

We are not forcing users to do two matrix assemblies per time step. For
most cases, there is even no need to update dF/dUdot at all. For extreme
cases that the application requires frequent update on dF/dUdot and
assembly is expensive, one can always assemble the single-matrix
Jacobian and throw it to SNES directly.

TS used to be an unusable pile of crap until Jed proposed the marvelous
IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
SIMPLE to express, and a single IFunction/IJacobian pair reusable for
different timestepper implementations a reality. And we got an added
bonus: this was efficient, it involved a SINGLE global matrix assembly. If
the issue is in supporting simpler problems, then we should go for an
alternative interface with broken Jacobians, just as Stefano proposed in a
previous email. I'm totally in favor of an additional broken Jacobians API,
and totally against the removal of the single-matrix IJacobian interface.

The current IJacobian is essentially SNESJacobian. And the single-matrix
SNESJacobian interface is always there. Power users could set up the
SNESJacobian directly if we pass a read-only shift parameter to them. So
we are by no means prohibiting power users from doing what they want.

So the plan would be to support both? The problem from the TS type point
of view is that we'd have to support both within all methods if they
co-exist. That would be really painful now (I'm thinking of arkimex,
rosw, ...) and much more painful in the future.

Emil

IJacobian with shift mixes TS Jacobian and SNES Jacobian. This is the
issue we need to fix.

Thanks,
Hong

I don't buy the argument that IJacobian with shift is ugly, and that such an
API drives users away from TS. At best, this is a documentation problem.
Come on, this is just the chain rule; it should be kindergarten-level stuff for
PETSc users. If we simplify things for the sake of making things simple for
newcomers and beginners and make them annoyingly slow for power users
solving complex problems, we will do a very bad business. This is not
politically correct, but I'm much worried about losing power users, you
know, those that can eventually make meaningful contributions to science
and software projects. Beginners and newcomers eventually learn and benefit
from common-sense software design, and will eventually appreciate it. I
really hope populism does not win this battle :-)
--
Lisandro Dalcin

Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa 

Office Phone: +966 12 808-0459




Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Zhang, Hong


On May 11, 2018, at 1:01 PM, Lisandro Dalcin wrote:

On Fri, 11 May 2018 at 19:34, Jed Brown wrote:

"Smith, Barry F." > writes:


I assemble the combined system directly.

 How, two sets of calls to MatSetValues(), One for the scaled mass
matrix and one for the Jacobian entries? For a constant mass matrix does
this mean you are recomputing the mass matrix entries each call? Or are you
storing the mass matrix entries somewhere?  Or is your mass matrix diagonal
only?

 Or do you build element by element the M*shift + J element stiffness
and then insert it with a single MatSetValues() call?

It isn't even built separately at the element scale, just summed
contributions at quadrature points.

That's exactly the way I do it in PetIGA, and the way I would do it in any
other general-purpose FEM-like code. In high-order FEM, Jacobian assembly
may very well account for 50% to 80% of runtime (at least that's my
experience with PetIGA). IMHO, forcing users to have to do TWO global
matrix assemblies per time step is simply unacceptable, both in terms of
memory and runtime.

We are not forcing users to do two matrix assemblies per time step. In most 
cases there is no need to update dF/dUdot at all. For the extreme cases where 
the application requires frequent updates of dF/dUdot and assembly is expensive, 
one can always assemble the single-matrix Jacobian and hand it to SNES 
directly.


TS used to be an unusable pile of crap until Jed proposed the marvelous
IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
SIMPLE to express, and a single IFunction/IJacobian pair reusable for
different timestepper implementations became a reality. And we got an added
bonus: this was efficient, it involved a SINGLE global matrix assembly. If
the issue is in supporting simpler problems, then we should go for an
alternative interface with broken Jacobians, just as Stefano proposed in a
previous email. I'm totally in favor of an additional broken Jacobians API,
and totally against the removal of the single-matrix IJacobian interface.

The current IJacobian is essentially SNESJacobian. And the single-matrix 
SNESJacobian interface is always there. Power users could set up the 
SNESJacobian directly if we pass a read-only shift parameter to them. So we are 
by no means prohibiting power users from doing what they want.
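
As a sketch of that power-user path, assuming a constant mass matrix and
assuming the shift were exposed read-only; TSGetIJacobianShift() and
AssembleJacobianPart() below are hypothetical names, not existing PETSc
functions:

  #include <petscts.h>

  typedef struct {
    TS  ts;
    Mat M;  /* precomputed constant mass matrix, i.e. dF/dUdot */
  } UserCtx;

  /* Hypothetical user routine: assembles dF/dU into P and leaves it assembled. */
  extern PetscErrorCode AssembleJacobianPart(Vec U, Mat P, UserCtx *user);
  /* Hypothetical read-only accessor for the current shift; not in PETSc today. */
  extern PetscErrorCode TSGetIJacobianShift(TS ts, PetscReal *shift);

  static PetscErrorCode MyShiftedSNESJacobian(SNES snes, Vec U, Mat A, Mat P, void *ctx)
  {
    UserCtx       *user = (UserCtx *)ctx;
    PetscReal      shift;
    PetscErrorCode ierr;

    ierr = TSGetIJacobianShift(user->ts, &shift);CHKERRQ(ierr);
    ierr = AssembleJacobianPart(U, P, user);CHKERRQ(ierr);                 /* P  = dF/dU          */
    ierr = MatAXPY(P, shift, user->M, SAME_NONZERO_PATTERN);CHKERRQ(ierr); /* P += shift*dF/dUdot */
    return 0;                                                              /* assumes A == P      */
  }

  /* Wiring, bypassing TSComputeIJacobian() entirely:
       ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
       ierr = SNESSetJacobian(snes, A, A, MyShiftedSNESJacobian, &user);CHKERRQ(ierr);  */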

IJacobian with shift mixes TS Jacobian and SNES Jacobian. This is the issue we 
need to fix.

Thanks,
Hong

I don't buy the argument that IJacobian with shift is ugly, and that such an
API drives users away from TS. At best, this is a documentation problem.
Come on, this is just the chain rule; it should be kindergarten-level stuff for
PETSc users. If we simplify things for the sake of making things simple for
newcomers and beginners and make them annoyingly slow for power users
solving complex problems, we will be making a very bad deal. This is not
politically correct, but I'm very worried about losing power users, you
know, those that can eventually make meaningful contributions to science
and software projects. Beginners and newcomers eventually learn and benefit
from common-sense software design, and will eventually appreciate it. I
really hope populism does not win this battle :-)
--
Lisandro Dalcin

Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459



Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 1:01 PM, Lisandro Dalcin  wrote:
> 
> On Fri, 11 May 2018 at 19:34, Jed Brown  wrote:
> 
>> "Smith, Barry F."  writes:
> 
> 
> I assemble the combined system directly.
 
  How, two sets of calls to MatSetValues(), One for the scaled mass
> matrix and one for the Jacobian entries? For a constant mass matrix does
> this mean you are recomputing the mass matrix entries each call? Or are you
> storing the mass matrix entries somewhere?  Or is your mass matrix diagonal
> only?
>>> 
>>>  Or do you build element by element the M*shift + J element stiffness
> and then insert it with a single MatSetValues() call?
> 
>> It isn't even built separately at the element scale, just summed
>> contributions at quadrature points.
> 
> That's exactly the way I do it in PetIGA, and the way I would do it in any
> other general-purpose FEM-like code. In high-order FEM, Jacobian assembly
> may very well account for 50% to 80% of runtime (at least that's my
> experience with PetIGA). IMHO, forcing users to have to do TWO global
> matrix assemblies per time step is simply unacceptable, both in terms of
> memory and runtime.
> 
> TS used to be an unusable pile of crap until Jed proposed the marvelous
> IJacobian interface. Suddenly COMPLEX fully-implicit DAE problems become
> SIMPLE to express, and a single IFunction/IJacobian pair reusable for
> different timestepper implementations became a reality. And we got an added
> bonus: this was efficient, it involved a SINGLE global matrix assembly. If
> the issue is in supporting simpler problems, then we should go for an
> alternative interface with broken Jacobians, just as Stefano proposed in a
> previous email. I'm totally in favor of an additional broken Jacobians API,
> and totally against the removal of the single-matrix IJacobian interface.

   Ok, thanks.

> 
> I don't buy the argument that IJacobian with shift is ugly,

   Not ugly, hard for most users to understand.

> and that such an
> API drives users away from TS. At best, this is a documentation problem.
> Come on, this is just the chain rule; it should be kindergarten-level stuff for
> PETSc users. If we simplify things for the sake of making things simple for
> newcomers and beginners and make them annoyingly slow for power users
> solving complex problems, we will be making a very bad deal. This is not
> politically correct, but I'm very worried about losing power users, you
> know, those that can eventually make meaningful contributions to science
> and software projects. Beginners and newcomers eventually learn and benefit
> from common-sense software design, and will eventually appreciate it. I
> really hope populism does not win this battle :-)
> 
> -- 
> Lisandro Dalcin
> 
> Research Scientist
> Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
> Extreme Computing Research Center (ECRC)
> King Abdullah University of Science and Technology (KAUST)
> http://ecrc.kaust.edu.sa/
> 
> 4700 King Abdullah University of Science and Technology
> al-Khawarizmi Bldg (Bldg 1), Office # 0109
> Thuwal 23955-6900, Kingdom of Saudi Arabia
> http://www.kaust.edu.sa
> 
> Office Phone: +966 12 808-0459



Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 12:58 PM, Stefano Zampini  
> wrote:
> 
> 
>> On May 11, 2018, at 6:20 PM, Smith, Barry F.  wrote:
>> 
>> 
>> 
>>> On May 11, 2018, at 8:03 AM, Stefano Zampini  
>>> wrote:
>>> 
>>> I don’t think changing the current TS API is best approach.
>>> 
>>> Obtaining separate Jacobians is a need for adjoints and tangent linear 
>>> models only.
>>> This is how I implemented it in stefano_zampini/feature-continuousadjoint
>>> https://bitbucket.org/petsc/petsc/src/7203629c61cbeced536ed8ba1dd2ef85ffb89e8f/src/ts/interface/tssplitjac.c#lines-48
>>> 
>>> Note that, instead of requiring the user to call 
>>> PetscObjectComposeFunction, we can use a function pointer and have 
>>> TSSetComputeSplitJacobians
>> 
>>  So you are proposing keeping TSSetIFunction and TSSetIJacobian and ADDING a 
>> new API TSSetComputeSplitJacobians(), and if that is not provided, calling 
>> TSComputeIJacobian() twice with different shifts (which is definitely not 
>> efficient and is what Hong does also).
>> 
> 
> Why is it inefficient?

   I meant calling the user ComputeIJacobian twice (to get the two parts) is 
inefficient. Not your API

> If you need BOTH dF/dUdot and dF/dU, you need two different assemblies even 
> if we change the API. Note that the physics is needed to evaluate dF/du, but 
> usually it’s not for dF/dUdot.
> And the MatAXPY is with SAME_NONZERO_PATTERN, so, basically no time.
> The only difference is one extra physics evaluation (that can be expensive). 
> However, advanced users that are aware of that, can provide their specialized 
> version of ComputeJacobians.
> 
> 
>>  Barry
>> 
>>> 
>>> 
>>> 
 On May 11, 2018, at 3:20 PM, Jed Brown  wrote:
 
 "Smith, Barry F."  writes:
 
>> On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
>> 
>> "Zhang, Hong"  writes:
>> 
>>> Dear PETSc folks,
>>> 
>>> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
>>> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
>>> Shampine's paper 
>>> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>>>  explains some reasoning behind it.
>>> 
>>> Our formulation is general enough to cover all the following common 
>>> cases
>>> 
>>> *   Udot = G(t,U) (classic ODE)
>>> *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
>>> *   M(t,U) Udot = G(t,U) (PDEs)
>>> 
>>> Yet the TS APIs provide the capability to solve both simple problems 
>>> and complicated problems. However, we are not doing well to make TS 
>>> easy to use and efficient especially for simple problems. Over the 
>>> years, we have clearly seen the main drawbacks including:
>>> 1. The shift parameter exposed in IJacobian is terribly confusing, 
>>> especially to new users. Also it is not conceptually straightforward 
>>> when using AD or finite differences on IFunction to approximate 
>>> IJacobian.
>> 
>> What isn't straightforward about AD or FD on the IFunction?  That one
>> bit of chain rule?
>> 
>>> 2. It is difficult to switch from fully implicit to fully explicit. 
>>> Users cannot use explicit methods when they provide IFunction/IJacobian.
>> 
>> This is a real issue, but it's extremely common for PDE to have boundary
>> conditions enforced as algebraic constraints, thus yielding a DAE.
>> 
>>> 3. The structure of mass matrix is completely invisible to TS. This 
>>> means giving up all the opportunities to improve efficiency. For 
>>> example, when M is constant or weakly dependent on U, we might not want 
>>> to evaluate/update it every time IJacobian is called. If M is diagonal, 
>>> the Jacobian can be shifted more efficiently than just using MatAXPY().
>> 
>> I don't understand 
>> 
>>> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes 
>>> buggy.
>> 
>> Why is "reshifting" needed?  After a step is rejected and when the step
>> size changes for a linear constant operator?
>> 
>>> Consider the scenario below.
>>> shift = a;
>>> TSComputeIJacobian()
>>> shift = b;
>>> TSComputeIJacobian() // with the same U and Udot as last call
>>> Changing the shift parameter requires the Jacobian function to be 
>>> evaluated again. If users provide only RHSJacobian, the Jacobian will 
>>> not be updated/reshifted in the second call because 
>>> TSComputeRHSJacobian() finds out that U has not been changed. This 
>>> issue is fixable by adding more logic into the already convoluted 
>>> implementation of TSComputeIJacobian(), but the intention here is to 
>>> illustrate the 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Stefano Zampini

> On May 11, 2018, at 6:20 PM, Smith, Barry F.  wrote:
> 
> 
> 
>> On May 11, 2018, at 8:03 AM, Stefano Zampini  
>> wrote:
>> 
>> I don’t think changing the current TS API is best approach.
>> 
>> Obtaining separate Jacobians is a need for adjoints and tangent linear 
>> models only.
>> This is how I implemented it in stefano_zampini/feature-continuousadjoint
>> https://bitbucket.org/petsc/petsc/src/7203629c61cbeced536ed8ba1dd2ef85ffb89e8f/src/ts/interface/tssplitjac.c#lines-48
>> 
>> Note that, instead of requiring the user to call PetscObjectComposeFunction, 
>> we can use a function pointer and have TSSetComputeSplitJacobians
> 
>   So you are proposing keeping TSSetIFunction and TSSetIJacobian and ADDING a 
> new API TSSetComputeSplitJacobians(), and if that is not provided, calling 
> TSComputeIJacobian() twice with different shifts (which is definitely not 
> efficient and is what Hong does also).
> 

Why is it inefficient? If you need BOTH dF/dUdot and dF/dU, you need two 
different assemblies even if we change the API. Note that the physics is needed 
to evaluate dF/dU, but usually it is not for dF/dUdot.
And the MatAXPY is with SAME_NONZERO_PATTERN, so it takes basically no time.
The only difference is one extra physics evaluation (which can be expensive). 
However, advanced users who are aware of that can provide their specialized 
version of ComputeJacobians.
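
For reference, a minimal sketch of that combination, assuming M (dF/dUdot),
J (dF/dU), and the target matrix A were preallocated with the same nonzero
pattern; the names are illustrative:

  #include <petscmat.h>

  /* Sketch: form A = shift*M + J from separately stored parts.  With
     SAME_NONZERO_PATTERN both calls operate on the stored values only,
     which is why this step costs essentially nothing. */
  static PetscErrorCode FormShiftedJacobian(Mat A, Mat J, Mat M, PetscReal shift)
  {
    PetscErrorCode ierr;

    ierr = MatCopy(J, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);        /* A  = dF/dU          */
    ierr = MatAXPY(A, shift, M, SAME_NONZERO_PATTERN);CHKERRQ(ierr); /* A += shift*dF/dUdot */
    return 0;
  }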


>   Barry
> 
>> 
>> 
>> 
>>> On May 11, 2018, at 3:20 PM, Jed Brown  wrote:
>>> 
>>> "Smith, Barry F."  writes:
>>> 
> On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
> 
> "Zhang, Hong"  writes:
> 
>> Dear PETSc folks,
>> 
>> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
>> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
>> Shampine's paper 
>> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>>  explains some reasoning behind it.
>> 
>> Our formulation is general enough to cover all the following common cases
>> 
>> *   Udot = G(t,U) (classic ODE)
>> *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
>> *   M(t,U) Udot = G(t,U) (PDEs)
>> 
>> Yet the TS APIs provide the capability to solve both simple problems and 
>> complicated problems. However, we are not doing well to make TS easy to 
>> use and efficient especially for simple problems. Over the years, we 
>> have clearly seen the main drawbacks including:
>> 1. The shift parameter exposed in IJacobian is terribly confusing, 
>> especially to new users. Also it is not conceptually straightforward 
>> when using AD or finite differences on IFunction to approximate 
>> IJacobian.
> 
> What isn't straightforward about AD or FD on the IFunction?  That one
> bit of chain rule?
> 
>> 2. It is difficult to switch from fully implicit to fully explicit. 
>> Users cannot use explicit methods when they provide IFunction/IJacobian.
> 
> This is a real issue, but it's extremely common for PDE to have boundary
> conditions enforced as algebraic constraints, thus yielding a DAE.
> 
>> 3. The structure of mass matrix is completely invisible to TS. This 
>> means giving up all the opportunities to improve efficiency. For 
>> example, when M is constant or weakly dependent on U, we might not want 
>> to evaluate/update it every time IJacobian is called. If M is diagonal, 
>> the Jacobian can be shifted more efficiently than just using MatAXPY().
> 
> I don't understand 
> 
>> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes 
>> buggy.
> 
> Why is "reshifting" needed?  After a step is rejected and when the step
> size changes for a linear constant operator?
> 
>> Consider the scenario below.
>> shift = a;
>> TSComputeIJacobian()
>> shift = b;
>> TSComputeIJacobian() // with the same U and Udot as last call
>> Changing the shift parameter requires the Jacobian function to be 
>> evaluated again. If users provide only RHSJacobian, the Jacobian will 
>> not be updated/reshifted in the second call because 
>> TSComputeRHSJacobian() finds out that U has not been changed. This issue 
>> is fixable by adding more logic into the already convoluted 
>> implementation of TSComputeIJacobian(), but the intention here is to 
>> illustrate the cumbersomeness of current IJacobian and the growing 
>> complications in TSComputeIJacobian() that IJacobian causes.
>> 
>> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
>> remove the shift parameter from IJacobian. dF/dU will be calculated by 
>> IJacobian; dF/dUdot will be calculated by 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>> On May 11, 2018, at 10:17 AM, Smith, Barry F.  wrote:
>> 
>> 
>> 
>>> On May 11, 2018, at 7:05 AM, Matthew Knepley  wrote:
>>> 
>>> On Fri, May 11, 2018 at 12:25 AM, Smith, Barry F.  
>>> wrote:
>>> 
>>> 
 On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
 
 "Zhang, Hong"  writes:
 
> Dear PETSc folks,
> 
> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
> Shampine's paper 
> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>  explains some reasoning behind it.
> 
> Our formulation is general enough to cover all the following common cases
> 
> *   Udot = G(t,U) (classic ODE)
> *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
> *   M(t,U) Udot = G(t,U) (PDEs)
> 
> Yet the TS APIs provide the capability to solve both simple problems and 
> complicated problems. However, we are not doing well to make TS easy to 
> use and efficient especially for simple problems. Over the years, we have 
> clearly seen the main drawbacks including:
> 1. The shift parameter exposed in IJacobian is terribly confusing, 
> especially to new users. Also it is not conceptually straightforward when 
> using AD or finite differences on IFunction to approximate IJacobian.
 
 What isn't straightforward about AD or FD on the IFunction?  That one
 bit of chain rule?
 
> 2. It is difficult to switch from fully implicit to fully explicit. Users 
> cannot use explicit methods when they provide IFunction/IJacobian.
 
 This is a real issue, but it's extremely common for PDE to have boundary
 conditions enforced as algebraic constraints, thus yielding a DAE.
 
> 3. The structure of mass matrix is completely invisible to TS. This means 
> giving up all the opportunities to improve efficiency. For example, when 
> M is constant or weakly dependent on U, we might not want to 
> evaluate/update it every time IJacobian is called. If M is diagonal, the 
> Jacobian can be shifted more efficiently than just using MatAXPY().
 
 I don't understand 
 
> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
 
 Why is "reshifting" needed?  After a step is rejected and when the step
 size changes for a linear constant operator?
 
> Consider the scenario below.
> shift = a;
> TSComputeIJacobian()
> shift = b;
> TSComputeIJacobian() // with the same U and Udot as last call
> Changing the shift parameter requires the Jacobian function to be 
> evaluated again. If users provide only RHSJacobian, the Jacobian will not 
> be updated/reshifted in the second call because TSComputeRHSJacobian() 
> finds out that U has not been changed. This issue is fixable by adding 
> more logic into the already convoluted implementation of 
> TSComputeIJacobian(), but the intention here is to illustrate the 
> cumbersomeness of current IJacobian and the growing complications in 
> TSComputeIJacobian() that IJacobian causes.
> 
> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
> remove the shift parameter from IJacobian. dF/dU will be calculated by 
> IJacobian; dF/dUdot will be calculated by a new callback function and 
> default to an identity matrix if it is not provided by users. Then the 
> users do not need to assemble the shifted Jacobian since TS will handle 
> the shifting whenever needed. And knowing the structure of dF/dUdot (the 
> mass matrix), TS will become more flexible. In particular, we will have
> 
> *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
> adjoint implementation a lot),
 
 How does this simplify the adjoint?
 
> *   plenty of opportunities to optimize TS when the mass matrix is 
> diagonal or constant or weakly dependent on U (which accounts for almost 
> all use cases in practice),
 
 But longer critical path,
>>> 
>>>   What do you mean by longer critical path?
>>> 
 more storage required, and more data motion.
>>> 
>>>  The extra storage needed is related to the size of the mass matrix 
>>> correct? And the extra data motion is related to the size of the mass 
>>> matrix correct?
>>> 
>>>   Is the extra work coming from a needed call to MatAXPY (to combine the 
>>> scaled mass matrix with the Jacobian) in Hong's approach? While, in theory, 
>>> the user can avoid the MatAXPY in the current code if they have custom code 
>>> that assembles directly the scaled mass matrix and Jacobian? But surely 
>>> most 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>> On May 11, 2018, at 7:20 AM, Jed Brown  wrote:
>> 
>> "Smith, Barry F."  writes:
>> 
 On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
 
 "Zhang, Hong"  writes:
 
> Dear PETSc folks,
> 
> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
> Shampine's paper 
> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>  explains some reasoning behind it.
> 
> Our formulation is general enough to cover all the following common cases
> 
> *   Udot = G(t,U) (classic ODE)
> *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
> *   M(t,U) Udot = G(t,U) (PDEs)
> 
> Yet the TS APIs provide the capability to solve both simple problems and 
> complicated problems. However, we are not doing well to make TS easy to 
> use and efficient especially for simple problems. Over the years, we have 
> clearly seen the main drawbacks including:
> 1. The shift parameter exposed in IJacobian is terribly confusing, 
> especially to new users. Also it is not conceptually straightforward when 
> using AD or finite differences on IFunction to approximate IJacobian.
 
 What isn't straightforward about AD or FD on the IFunction?  That one
 bit of chain rule?
 
> 2. It is difficult to switch from fully implicit to fully explicit. Users 
> cannot use explicit methods when they provide IFunction/IJacobian.
 
 This is a real issue, but it's extremely common for PDE to have boundary
 conditions enforced as algebraic constraints, thus yielding a DAE.
 
> 3. The structure of mass matrix is completely invisible to TS. This means 
> giving up all the opportunities to improve efficiency. For example, when 
> M is constant or weakly dependent on U, we might not want to 
> evaluate/update it every time IJacobian is called. If M is diagonal, the 
> Jacobian can be shifted more efficiently than just using MatAXPY().
 
 I don't understand 
 
> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
 
 Why is "reshifting" needed?  After a step is rejected and when the step
 size changes for a linear constant operator?
 
> Consider the scenario below.
> shift = a;
> TSComputeIJacobian()
> shift = b;
> TSComputeIJacobian() // with the same U and Udot as last call
> Changing the shift parameter requires the Jacobian function to be 
> evaluated again. If users provide only RHSJacobian, the Jacobian will not 
> be updated/reshifted in the second call because TSComputeRHSJacobian() 
> finds out that U has not been changed. This issue is fixable by adding 
> more logic into the already convoluted implementation of 
> TSComputeIJacobian(), but the intention here is to illustrate the 
> cumbersomeness of current IJacobian and the growing complications in 
> TSComputeIJacobian() that IJacobian causes.
> 
> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
> remove the shift parameter from IJacobian. dF/dU will be calculated by 
> IJacobian; dF/dUdot will be calculated by a new callback function and 
> default to an identity matrix if it is not provided by users. Then the 
> users do not need to assemble the shifted Jacobian since TS will handle 
> the shifting whenever needed. And knowing the structure of dF/dUdot (the 
> mass matrix), TS will become more flexible. In particular, we will have
> 
> *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
> adjoint implementation a lot),
 
 How does this simplify the adjoint?
 
> *   plenty of opportunities to optimize TS when the mass matrix is 
> diagonal or constant or weakly dependent on U (which accounts for almost 
> all use cases in practice),
 
 But longer critical path,
>>> 
>>>   What do you mean by longer critical path?
>> 
>> Create Mass (dF/dUdot) matrix, call MatAssembly, create dF/dU, call
>> MatAssembly, call MatAXPY (involves another MatAssembly unless
>> SAME_NONZERO_PATTERN).  That's a long sequence for what could be one
>> MatAssembly.  Also note that geometric setup for elements is usually
>> done in each element loop.  For simple physics, this is way more
>> expensive than the physics (certainly the case for LibMesh and Deal.II).
>> 
 more storage required, and more data motion.
>>> 
>>>  The extra storage needed is related to the size of the mass matrix 
>>> correct? And the extra data motion is related to the size of the mass 
>>> matrix correct?
>> 
>> Yes, which is the same as the 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 10:17 AM, Smith, Barry F.  wrote:
> 
> 
> 
>> On May 11, 2018, at 7:05 AM, Matthew Knepley  wrote:
>> 
>> On Fri, May 11, 2018 at 12:25 AM, Smith, Barry F.  wrote:
>> 
>> 
>>> On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
>>> 
>>> "Zhang, Hong"  writes:
>>> 
 Dear PETSc folks,
 
 Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
 designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
 Shampine's paper 
 (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
  explains some reasoning behind it.
 
 Our formulation is general enough to cover all the following common cases
 
 *   Udot = G(t,U) (classic ODE)
 *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
 *   M(t,U) Udot = G(t,U) (PDEs)
 
 Yet the TS APIs provide the capability to solve both simple problems and 
 complicated problems. However, we are not doing well to make TS easy to 
 use and efficient especially for simple problems. Over the years, we have 
 clearly seen the main drawbacks including:
 1. The shift parameter exposed in IJacobian is terribly confusing, 
 especially to new users. Also it is not conceptually straightforward when 
 using AD or finite differences on IFunction to approximate IJacobian.
>>> 
>>> What isn't straightforward about AD or FD on the IFunction?  That one
>>> bit of chain rule?
>>> 
 2. It is difficult to switch from fully implicit to fully explicit. Users 
 cannot use explicit methods when they provide IFunction/IJacobian.
>>> 
>>> This is a real issue, but it's extremely common for PDE to have boundary
>>> conditions enforced as algebraic constraints, thus yielding a DAE.
>>> 
 3. The structure of mass matrix is completely invisible to TS. This means 
 giving up all the opportunities to improve efficiency. For example, when M 
 is constant or weakly dependent on U, we might not want to evaluate/update 
 it every time IJacobian is called. If M is diagonal, the Jacobian can be 
 shifted more efficiently than just using MatAXPY().
>>> 
>>> I don't understand 
>>> 
 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
>>> 
>>> Why is "reshifting" needed?  After a step is rejected and when the step
>>> size changes for a linear constant operator?
>>> 
 Consider the scenario below.
 shift = a;
 TSComputeIJacobian()
 shift = b;
 TSComputeIJacobian() // with the same U and Udot as last call
 Changing the shift parameter requires the Jacobian function to be 
 evaluated again. If users provide only RHSJacobian, the Jacobian will not 
 be updated/reshifted in the second call because TSComputeRHSJacobian() 
 finds out that U has not been changed. This issue is fixable by adding 
 more logic into the already convoluted implementation of 
 TSComputeIJacobian(), but the intention here is to illustrate the 
 cumbersomeness of current IJacobian and the growing complications in 
 TSComputeIJacobian() that IJacobian causes.
 
 So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
 remove the shift parameter from IJacobian. dF/dU will be calculated by 
 IJacobian; dF/dUdot will be calculated by a new callback function and 
 default to an identity matrix if it is not provided by users. Then the 
 users do not need to assemble the shifted Jacobian since TS will handle 
 the shifting whenever needed. And knowing the structure of dF/dUdot (the 
 mass matrix), TS will become more flexible. In particular, we will have
 
 *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
 adjoint implementation a lot),
>>> 
>>> How does this simplify the adjoint?
>>> 
 *   plenty of opportunities to optimize TS when the mass matrix is 
 diagonal or constant or weakly dependent on U (which accounts for almost 
 all use cases in practice),
>>> 
>>> But longer critical path,
>> 
>>   What do you mean by longer critical path?
>> 
>>> more storage required, and more data motion.
>> 
>>  The extra storage needed is related to the size of the mass matrix correct? 
>> And the extra data motion is related to the size of the mass matrix correct?
>> 
>>   Is the extra work coming from a needed call to MatAXPY (to combine the 
>> scaled mass matrix with the Jacobian) in Hong's approach? While, in theory, 
>> the user can avoid the MatAXPY in the current code if they have custom code 
>> that assembles directly the scaled mass matrix and Jacobian? But surely most 
>> users would not write such custom code and would themselves keep a copy of 
>> the mass matrix (likely constant) and use MatAXPY() to combine the copy of 
>> 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Lisandro Dalcin
On Fri, 11 May 2018 at 18:20, Smith, Barry F.  wrote:

> So you are proposing keeping TSSetIFunction and TSSetIJacobian and
ADDING a new API TSSetComputeSplitJacobians(), and if that is not provided,
calling TSComputeIJacobian() twice with different shifts (which is
definitely not efficient and is what Hong does also).


Yes, that's basically the proposal. Users could even provide both IJacobian
and SplitJacobians, and TS could then make the best use of these callbacks
depending on which is more efficient.
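
To make that concrete, one possible shape for the additional callback; the
signature below is purely illustrative and is neither the API in the branch
linked earlier nor an existing PETSc function:

  #include <petscts.h>

  /* Illustrative only: a split-Jacobians callback fills Jdot = dF/dUdot and
     Ju = dF/dU separately, with no shift involved; the integrator (or the
     adjoint code) then combines or consumes the parts as it sees fit. */
  typedef PetscErrorCode (*TSSplitJacobianFn)(TS ts, PetscReal t, Vec U, Vec Udot,
                                              Mat Jdot, Mat Ju, void *ctx);

  /* Illustrative registration routine; users who care about the
     single-assembly path would keep providing TSSetIJacobian() as well. */
  extern PetscErrorCode TSSetComputeSplitJacobians(TS ts, TSSplitJacobianFn f, void *ctx);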


-- 
Lisandro Dalcin

Research Scientist
Computer, Electrical and Mathematical Sciences & Engineering (CEMSE)
Extreme Computing Research Center (ECRC)
King Abdullah University of Science and Technology (KAUST)
http://ecrc.kaust.edu.sa/

4700 King Abdullah University of Science and Technology
al-Khawarizmi Bldg (Bldg 1), Office # 0109
Thuwal 23955-6900, Kingdom of Saudi Arabia
http://www.kaust.edu.sa

Office Phone: +966 12 808-0459


Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Zhang, Hong
Before we go down the rabbit hole, let me reiterate the primary point: an 
unfriendly API breaks the deal in the first place. Perhaps we should reflect on 
why many other software packages use PETSc just as a nonlinear/linear solver and 
implement their own time steppers instead of using TS. FWIW I think the weird 
IJacobian with shift is not user-friendly.

On May 11, 2018, at 7:20 AM, Jed Brown wrote:

"Smith, Barry F." > writes:

On May 10, 2018, at 4:12 PM, Jed Brown wrote:

"Zhang, Hong" > writes:

Dear PETSc folks,

Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were designed for 
the fully implicit formulation F(t,U,Udot) = G(t,U).
Shampine's paper 
(https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
 explains some reasoning behind it.

Our formulation is general enough to cover all the following common cases

*   Udot = G(t,U) (classic ODE)
*   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
*   M(t,U) Udot = G(t,U) (PDEs)

Yet the TS APIs provide the capability to solve both simple problems and 
complicated problems. However, we are not doing well to make TS easy to use and 
efficient especially for simple problems. Over the years, we have clearly seen 
the main drawbacks including:
1. The shift parameter exposed in IJacobian is terribly confusing, especially 
to new users. Also it is not conceptually straightforward when using AD or 
finite differences on IFunction to approximate IJacobian.

What isn't straightforward about AD or FD on the IFunction?  That one
bit of chain rule?

2. It is difficult to switch from fully implicit to fully explicit. Users 
cannot use explicit methods when they provide IFunction/IJacobian.

This is a real issue, but it's extremely common for PDE to have boundary
conditions enforced as algebraic constraints, thus yielding a DAE.

3. The structure of mass matrix is completely invisible to TS. This means 
giving up all the opportunities to improve efficiency. For example, when M is 
constant or weakly dependent on U, we might not want to evaluate/update it 
every time IJacobian is called. If M is diagonal, the Jacobian can be shifted 
more efficiently than just using MatAXPY().

I don't understand

4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.

Why is "reshifting" needed?  After a step is rejected and when the step
size changes for a linear constant operator?

Consider the scenario below.
shift = a;
TSComputeIJacobian()
shift = b;
TSComputeIJacobian() // with the same U and Udot as last call
Changing the shift parameter requires the Jacobian function to be evaluated 
again. If users provide only RHSJacobian, the Jacobian will not be 
updated/reshifted in the second call because TSComputeRHSJacobian() finds out 
that U has not been changed. This issue is fixable by adding more logic into 
the already convoluted implementation of TSComputeIJacobian(), but the 
intention here is to illustrate the cumbersomeness of current IJacobian and the 
growing complications in TSComputeIJacobian() that IJacobian causes.

So I propose that we have two separate matrices dF/dUdot and dF/dU, and remove 
the shift parameter from IJacobian. dF/dU will be calculated by IJacobian; 
dF/dUdot will be calculated by a new callback function and default to an 
identity matrix if it is not provided by users. Then the users do not need to 
assemble the shifted Jacobian since TS will handle the shifting whenever 
needed. And knowing the structure of dF/dUdot (the mass matrix), TS will become 
more flexible. In particular, we will have

*   easy access to the unshifted Jacobian dF/dU (this simplifies the adjoint 
implementation a lot),

How does this simplify the adjoint?

*   plenty of opportunities to optimize TS when the mass matrix is diagonal or 
constant or weakly dependent on U (which accounts for almost all use cases in 
practice),

But longer critical path,

  What do you mean by longer critical path?

Create Mass (dF/dUdot) matrix, call MatAssembly, create dF/dU, call
MatAssembly, call MatAXPY (involves another MatAssembly unless
SAME_NONZERO_PATTERN).  That's a long sequence for what could be one
MatAssembly.  Also note that geometric setup for elements is usually
done in each element loop.  For simple physics, this is way more
expensive than the physics (certainly the case for LibMesh and Deal.II).

So the benefit of asking users to provide the shifted Jacobian is that they 
could use fewer MatAssembly calls inside IJacobian. But what I proposed aims for 
the benefit of reducing the number of Jacobian or mass matrix evaluations.
Consider how we handle the following simple cases in the new approach:
1. Udot = G(t,U) -- users do not need to provide dF/dUdot, no 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 7:20 AM, Jed Brown  wrote:
> 
> "Smith, Barry F."  writes:
> 
>>> On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
>>> 
>>> "Zhang, Hong"  writes:
>>> 
 Dear PETSc folks,
 
 Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
 designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
 Shampine's paper 
 (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
  explains some reasoning behind it.
 
 Our formulation is general enough to cover all the following common cases
 
 *   Udot = G(t,U) (classic ODE)
 *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
 *   M(t,U) Udot = G(t,U) (PDEs)
 
 Yet the TS APIs provide the capability to solve both simple problems and 
 complicated problems. However, we are not doing well to make TS easy to 
 use and efficient especially for simple problems. Over the years, we have 
 clearly seen the main drawbacks including:
 1. The shift parameter exposed in IJacobian is terribly confusing, 
 especially to new users. Also it is not conceptually straightforward when 
 using AD or finite differences on IFunction to approximate IJacobian.
>>> 
>>> What isn't straightforward about AD or FD on the IFunction?  That one
>>> bit of chain rule?
>>> 
 2. It is difficult to switch from fully implicit to fully explicit. Users 
 cannot use explicit methods when they provide IFunction/IJacobian.
>>> 
>>> This is a real issue, but it's extremely common for PDE to have boundary
>>> conditions enforced as algebraic constraints, thus yielding a DAE.
>>> 
 3. The structure of mass matrix is completely invisible to TS. This means 
 giving up all the opportunities to improve efficiency. For example, when M 
 is constant or weakly dependent on U, we might not want to evaluate/update 
 it every time IJacobian is called. If M is diagonal, the Jacobian can be 
 shifted more efficiently than just using MatAXPY().
>>> 
>>> I don't understand 
>>> 
 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
>>> 
>>> Why is "reshifting" needed?  After a step is rejected and when the step
>>> size changes for a linear constant operator?
>>> 
 Consider the scenario below.
 shift = a;
 TSComputeIJacobian()
 shift = b;
 TSComputeIJacobian() // with the same U and Udot as last call
 Changing the shift parameter requires the Jacobian function to be 
 evaluated again. If users provide only RHSJacobian, the Jacobian will not 
 be updated/reshifted in the second call because TSComputeRHSJacobian() 
 finds out that U has not been changed. This issue is fixable by adding 
 more logic into the already convoluted implementation of 
 TSComputeIJacobian(), but the intention here is to illustrate the 
 cumbersomeness of current IJacobian and the growing complications in 
 TSComputeIJacobian() that IJacobian causes.
 
 So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
 remove the shift parameter from IJacobian. dF/dU will be calculated by 
 IJacobian; dF/dUdot will be calculated by a new callback function and 
 default to an identity matrix if it is not provided by users. Then the 
 users do not need to assemble the shifted Jacobian since TS will handle 
 the shifting whenever needed. And knowing the structure of dF/dUdot (the 
 mass matrix), TS will become more flexible. In particular, we will have
 
 *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
 adjoint implementation a lot),
>>> 
>>> How does this simplify the adjoint?
>>> 
 *   plenty of opportunities to optimize TS when the mass matrix is 
 diagonal or constant or weakly dependent on U (which accounts for almost 
 all use cases in practice),
>>> 
>>> But longer critical path,
>> 
>>   What do you mean by longer critical path?
> 
> Create Mass (dF/dUdot) matrix, call MatAssembly, create dF/dU, call
> MatAssembly, call MatAXPY (involves another MatAssembly unless
> SAME_NONZERO_PATTERN).  That's a long sequence for what could be one
> MatAssembly.  Also note that geometric setup for elements is usually
> done in each element loop.  For simple physics, this is way more
> expensive than the physics (certainly the case for LibMesh and Deal.II).
> 
>>> more storage required, and more data motion.
>> 
>>  The extra storage needed is related to the size of the mass matrix correct? 
>> And the extra data motion is related to the size of the mass matrix correct?
> 
> Yes, which is the same as the stiffness matrix for finite element methods.

  If the stiffness matrix is the same size (nonzero pattern) as the operator 
then MatAXPY() is fast axpy(), no 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 8:03 AM, Stefano Zampini  
> wrote:
> 
> I don’t think changing the current TS API is best approach.
> 
> Obtaining separate Jacobians is a need for adjoints and tangent linear models 
> only.
> This is how I implemented it in stefano_zampini/feature-continuousadjoint
> https://bitbucket.org/petsc/petsc/src/7203629c61cbeced536ed8ba1dd2ef85ffb89e8f/src/ts/interface/tssplitjac.c#lines-48
> 
> Note that, instead of requiring the user to call PetscObjectComposeFunction, 
> we can use a function pointer and have TSSetComputeSplitJacobians

   So you are proposing keeping TSSetIFunction and TSSetIJacobian and ADDING a 
new API TSSetComputeSplitJacobians(), and if that is not provided, calling 
TSComputeIJacobian() twice with different shifts (which is definitely not 
efficient and is what Hong does also).

   Barry

> 
> 
> 
>> On May 11, 2018, at 3:20 PM, Jed Brown  wrote:
>> 
>> "Smith, Barry F."  writes:
>> 
 On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
 
 "Zhang, Hong"  writes:
 
> Dear PETSc folks,
> 
> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
> Shampine's paper 
> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>  explains some reasoning behind it.
> 
> Our formulation is general enough to cover all the following common cases
> 
> *   Udot = G(t,U) (classic ODE)
> *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
> *   M(t,U) Udot = G(t,U) (PDEs)
> 
> Yet the TS APIs provide the capability to solve both simple problems and 
> complicated problems. However, we are not doing well to make TS easy to 
> use and efficient especially for simple problems. Over the years, we have 
> clearly seen the main drawbacks including:
> 1. The shift parameter exposed in IJacobian is terribly confusing, 
> especially to new users. Also it is not conceptually straightforward when 
> using AD or finite differences on IFunction to approximate IJacobian.
 
 What isn't straightforward about AD or FD on the IFunction?  That one
 bit of chain rule?
 
> 2. It is difficult to switch from fully implicit to fully explicit. Users 
> cannot use explicit methods when they provide IFunction/IJacobian.
 
 This is a real issue, but it's extremely common for PDE to have boundary
 conditions enforced as algebraic constraints, thus yielding a DAE.
 
> 3. The structure of mass matrix is completely invisible to TS. This means 
> giving up all the opportunities to improve efficiency. For example, when 
> M is constant or weakly dependent on U, we might not want to 
> evaluate/update it every time IJacobian is called. If M is diagonal, the 
> Jacobian can be shifted more efficiently than just using MatAXPY().
 
 I don't understand 
 
> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
 
 Why is "reshifting" needed?  After a step is rejected and when the step
 size changes for a linear constant operator?
 
> Consider the scenario below.
> shift = a;
> TSComputeIJacobian()
> shift = b;
> TSComputeIJacobian() // with the same U and Udot as last call
> Changing the shift parameter requires the Jacobian function to be 
> evaluated again. If users provide only RHSJacobian, the Jacobian will not 
> be updated/reshifted in the second call because TSComputeRHSJacobian() 
> finds out that U has not been changed. This issue is fixable by adding 
> more logic into the already convoluted implementation of 
> TSComputeIJacobian(), but the intention here is to illustrate the 
> cumbersomeness of current IJacobian and the growing complications in 
> TSComputeIJacobian() that IJacobian causes.
> 
> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
> remove the shift parameter from IJacobian. dF/dU will be calculated by 
> IJacobian; dF/dUdot will be calculated by a new callback function and 
> default to an identity matrix if it is not provided by users. Then the 
> users do not need to assemble the shifted Jacobian since TS will handle 
> the shifting whenever needed. And knowing the structure of dF/dUdot (the 
> mass matrix), TS will become more flexible. In particular, we will have
> 
> *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
> adjoint implementation a lot),
 
 How does this simplify the adjoint?
 
> *   plenty of opportunities to optimize TS when the mass matrix is 
> diagonal or constant or weakly dependent on U (which accounts for 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Smith, Barry F.


> On May 11, 2018, at 7:05 AM, Matthew Knepley  wrote:
> 
> On Fri, May 11, 2018 at 12:25 AM, Smith, Barry F.  wrote:
> 
> 
> > On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
> > 
> > "Zhang, Hong"  writes:
> > 
> >> Dear PETSc folks,
> >> 
> >> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
> >> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
> >> Shampine's paper 
> >> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
> >>  explains some reasoning behind it.
> >> 
> >> Our formulation is general enough to cover all the following common cases
> >> 
> >>  *   Udot = G(t,U) (classic ODE)
> >>  *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
> >>  *   M(t,U) Udot = G(t,U) (PDEs)
> >> 
> >> Yet the TS APIs provide the capability to solve both simple problems and 
> >> complicated problems. However, we are not doing well to make TS easy to 
> >> use and efficient especially for simple problems. Over the years, we have 
> >> clearly seen the main drawbacks including:
> >> 1. The shift parameter exposed in IJacobian is terribly confusing, 
> >> especially to new users. Also it is not conceptually straightforward when 
> >> using AD or finite differences on IFunction to approximate IJacobian.
> > 
> > What isn't straightforward about AD or FD on the IFunction?  That one
> > bit of chain rule?
> > 
> >> 2. It is difficult to switch from fully implicit to fully explicit. Users 
> >> cannot use explicit methods when they provide IFunction/IJacobian.
> > 
> > This is a real issue, but it's extremely common for PDE to have boundary
> > conditions enforced as algebraic constraints, thus yielding a DAE.
> > 
> >> 3. The structure of mass matrix is completely invisible to TS. This means 
> >> giving up all the opportunities to improve efficiency. For example, when M 
> >> is constant or weakly dependent on U, we might not want to evaluate/update 
> >> it every time IJacobian is called. If M is diagonal, the Jacobian can be 
> >> shifted more efficiently than just using MatAXPY().
> > 
> > I don't understand 
> > 
> >> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
> > 
> > Why is "reshifting" needed?  After a step is rejected and when the step
> > size changes for a linear constant operator?
> > 
> >> Consider the scenario below.
> >> shift = a;
> >>  TSComputeIJacobian()
> >>  shift = b;
> >>  TSComputeIJacobian() // with the same U and Udot as last call
> >> Changing the shift parameter requires the Jacobian function to be 
> >> evaluated again. If users provide only RHSJacobian, the Jacobian will not 
> >> be updated/reshifted in the second call because TSComputeRHSJacobian() 
> >> finds out that U has not been changed. This issue is fixable by adding 
> >> more logic into the already convoluted implementation of 
> >> TSComputeIJacobian(), but the intention here is to illustrate the 
> >> cumbersomeness of current IJacobian and the growing complications in 
> >> TSComputeIJacobian() that IJacobian causes.
> >> 
> >> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
> >> remove the shift parameter from IJacobian. dF/dU will be calculated by 
> >> IJacobian; dF/dUdot will be calculated by a new callback function and 
> >> default to an identity matrix if it is not provided by users. Then the 
> >> users do not need to assemble the shifted Jacobian since TS will handle 
> >> the shifting whenever needed. And knowing the structure of dF/dUdot (the 
> >> mass matrix), TS will become more flexible. In particular, we will have
> >> 
> >>  *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
> >> adjoint implementation a lot),
> > 
> > How does this simplify the adjoint?
> > 
> >>  *   plenty of opportunities to optimize TS when the mass matrix is 
> >> diagonal or constant or weakly dependent on U (which accounts for almost 
> >> all use cases in practice),
> > 
> > But longer critical path,
> 
>What do you mean by longer critical path?
> 
> > more storage required, and more data motion.
> 
>   The extra storage needed is related to the size of the mass matrix correct? 
> And the extra data motion is related to the size of the mass matrix correct?
> 
>Is the extra work coming from a needed call to MatAXPY (to combine the 
> scaled mass matrix with the Jacobian) in Hong's approach? While, in theory, 
> the user can avoid the MatAXPY in the current code if they have custom code 
> that assembles directly the scaled mass matrix and Jacobian? But surely most 
> users would not write such custom code and would themselves keep a copy of 
> the mass matrix (likely constant) and use MatAXPY() to combine the copy of 
> the mass matrix with the Jacobian they compute at each timestep/stage?  Or am 
> I missing 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Stefano Zampini
I don’t think changing the current TS API is best approach.

Obtaining separate Jacobians is a need for adjoints and tangent linear models 
only.
This is how I implemented it in stefano_zampini/feature-continuousadjoint
https://bitbucket.org/petsc/petsc/src/7203629c61cbeced536ed8ba1dd2ef85ffb89e8f/src/ts/interface/tssplitjac.c#lines-48
 


Note that, instead of requiring the user to call PetscObjectComposeFunction, we 
can use a function pointer and have TSSetComputeSplitJacobians



> On May 11, 2018, at 3:20 PM, Jed Brown  wrote:
> 
> "Smith, Barry F." > writes:
> 
>>> On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
>>> 
>>> "Zhang, Hong"  writes:
>>> 
 Dear PETSc folks,
 
 Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
 designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
 Shampine's paper 
 (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
  explains some reasoning behind it.
 
 Our formulation is general enough to cover all the following common cases
 
 *   Udot = G(t,U) (classic ODE)
 *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
 *   M(t,U) Udot = G(t,U) (PDEs)
 
 Yet the TS APIs provide the capability to solve both simple problems and 
 complicated problems. However, we are not doing well to make TS easy to 
 use and efficient especially for simple problems. Over the years, we have 
 clearly seen the main drawbacks including:
 1. The shift parameter exposed in IJacobian is terribly confusing, 
 especially to new users. Also it is not conceptually straightforward when 
 using AD or finite differences on IFunction to approximate IJacobian.
>>> 
>>> What isn't straightforward about AD or FD on the IFunction?  That one
>>> bit of chain rule?
>>> 
 2. It is difficult to switch from fully implicit to fully explicit. Users 
 cannot use explicit methods when they provide IFunction/IJacobian.
>>> 
>>> This is a real issue, but it's extremely common for PDE to have boundary
>>> conditions enforced as algebraic constraints, thus yielding a DAE.
>>> 
 3. The structure of mass matrix is completely invisible to TS. This means 
 giving up all the opportunities to improve efficiency. For example, when M 
 is constant or weakly dependent on U, we might not want to evaluate/update 
 it every time IJacobian is called. If M is diagonal, the Jacobian can be 
 shifted more efficiently than just using MatAXPY().
>>> 
>>> I don't understand 
>>> 
 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
>>> 
>>> Why is "reshifting" needed?  After a step is rejected and when the step
>>> size changes for a linear constant operator?
>>> 
 Consider the scenario below.
 shift = a;
 TSComputeIJacobian()
 shift = b;
 TSComputeIJacobian() // with the same U and Udot as last call
 Changing the shift parameter requires the Jacobian function to be 
 evaluated again. If users provide only RHSJacobian, the Jacobian will not 
 be updated/reshifted in the second call because TSComputeRHSJacobian() 
 finds out that U has not been changed. This issue is fixable by adding 
 more logic into the already convoluted implementation of 
 TSComputeIJacobian(), but the intention here is to illustrate the 
 cumbersomeness of current IJacobian and the growing complications in 
 TSComputeIJacobian() that IJacobian causes.
 
 So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
 remove the shift parameter from IJacobian. dF/dU will be calculated by 
 IJacobian; dF/dUdot will be calculated by a new callback function and 
 default to an identity matrix if it is not provided by users. Then the 
 users do not need to assemble the shifted Jacobian since TS will handle 
 the shifting whenever needed. And knowing the structure of dF/dUdot (the 
 mass matrix), TS will become more flexible. In particular, we will have
 
 *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
 adjoint implementation a lot),
>>> 
>>> How does this simplify the adjoint?
>>> 
 *   plenty of opportunities to optimize TS when the mass matrix is 
 diagonal or constant or weakly dependent on U (which accounts for almost 
 all use cases in practice),
>>> 
>>> But longer critical path,
>> 
>>   What do you mean by longer critical path?
> 
> Create Mass (dF/dUdot) matrix, call MatAssembly, create dF/dU, call
> MatAssembly, call MatAXPY (involves another MatAssembly unless
> SAME_NONZERO_PATTERN).  That's a long sequence for what 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Jed Brown
"Smith, Barry F."  writes:

>> On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
>> 
>> "Zhang, Hong"  writes:
>> 
>>> Dear PETSc folks,
>>> 
>>> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were designed 
>>> for the fully implicit formulation F(t,U,Udot) = G(t,U).
>>> Shampine's paper 
>>> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>>>  explains some reasoning behind it.
>>> 
>>> Our formulation is general enough to cover all the following common cases
>>> 
>>>  *   Udot = G(t,U) (classic ODE)
>>>  *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
>>>  *   M(t,U) Udot = G(t,U) (PDEs)
>>> 
>>> Yet the TS APIs provide the capability to solve both simple problems and 
>>> complicated problems. However, we are not doing well to make TS easy to use 
>>> and efficient especially for simple problems. Over the years, we have 
>>> clearly seen the main drawbacks including:
>>> 1. The shift parameter exposed in IJacobian is terribly confusing, 
>>> especially to new users. Also it is not conceptually straightforward when 
>>> using AD or finite differences on IFunction to approximate IJacobian.
>> 
>> What isn't straightforward about AD or FD on the IFunction?  That one
>> bit of chain rule?
>> 
>>> 2. It is difficult to switch from fully implicit to fully explicit. Users 
>>> cannot use explicit methods when they provide IFunction/IJacobian.
>> 
>> This is a real issue, but it's extremely common for PDE to have boundary
>> conditions enforced as algebraic constraints, thus yielding a DAE.
>> 
>>> 3. The structure of mass matrix is completely invisible to TS. This means 
>>> giving up all the opportunities to improve efficiency. For example, when M 
>>> is constant or weakly dependent on U, we might not want to evaluate/update 
>>> it every time IJacobian is called. If M is diagonal, the Jacobian can be 
>>> shifted more efficiently than just using MatAXPY().
>> 
>> I don't understand 
>> 
>>> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
>> 
>> Why is "reshifting" needed?  After a step is rejected and when the step
>> size changes for a linear constant operator?
>> 
>>> Consider the scenario below.
>>> shift = a;
>>>  TSComputeIJacobian()
>>>  shift = b;
>>>  TSComputeIJacobian() // with the same U and Udot as last call
>>> Changing the shift parameter requires the Jacobian function to be evaluated 
>>> again. If users provide only RHSJacobian, the Jacobian will not be 
>>> updated/reshifted in the second call because TSComputeRHSJacobian() finds 
>>> out that U has not been changed. This issue is fixable by adding more logic 
>>> into the already convoluted implementation of TSComputeIJacobian(), but the 
>>> intention here is to illustrate the cumbersomeness of current IJacobian and 
>>> the growing complications in TSComputeIJacobian() that IJacobian causes.
>>> 
>>> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
>>> remove the shift parameter from IJacobian. dF/dU will be calculated by 
>>> IJacobian; dF/dUdot will be calculated by a new callback function and 
>>> default to an identity matrix if it is not provided by users. Then the 
>>> users do not need to assemble the shifted Jacobian since TS will handle the 
>>> shifting whenever needed. And knowing the structure of dF/dUdot (the mass 
>>> matrix), TS will become more flexible. In particular, we will have
>>> 
>>>  *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
>>> adjoint implementation a lot),
>> 
>> How does this simplify the adjoint?
>> 
>>>  *   plenty of opportunities to optimize TS when the mass matrix is 
>>> diagonal or constant or weakly dependent on U (which accounts for almost 
>>> all use cases in practice),
>> 
>> But longer critical path,
>
>What do you mean by longer critical path?

Create Mass (dF/dUdot) matrix, call MatAssembly, create dF/dU, call
MatAssembly, call MatAXPY (involves another MatAssembly unless
SAME_NONZERO_PATTERN).  That's a long sequence for what could be one
MatAssembly.  Also note that geometric setup for elements is usually
done in each element loop.  For simple physics, this is way more
expensive than the physics (certainly the case for LibMesh and Deal.II).
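
To make that sequence concrete, a minimal sketch of the two-matrix path in C
(ComputeMass() and ComputeJacobian() are illustrative stand-ins for user
element loops, not PETSc API):

  #include <petscts.h>

  /* Sketch only: two assembled matrices plus a MatAXPY, as described above. */
  extern PetscErrorCode ComputeMass(PetscReal t, Vec U, Mat M, void *ctx);
  extern PetscErrorCode ComputeJacobian(PetscReal t, Vec U, Mat J, void *ctx);

  static PetscErrorCode FormShiftedJacobianTwoMatrix(PetscReal t, Vec U, PetscReal shift,
                                                     Mat M, Mat J, void *ctx)
  {
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = ComputeMass(t, U, M, ctx);CHKERRQ(ierr);        /* element loop #1: dF/dUdot */
    ierr = MatAssemblyBegin(M, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(M, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = ComputeJacobian(t, U, J, ctx);CHKERRQ(ierr);    /* element loop #2: dF/dU    */
    ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    /* J <- J + shift*M.  If the two nonzero patterns were not identical, this
       MatAXPY would trigger yet another assembly, which is the point above. */
    ierr = MatAXPY(J, shift, M, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }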

>> more storage required, and more data motion.
>
>   The extra storage needed is related to the size of the mass matrix correct? 
> And the extra data motion is related to the size of the mass matrix correct?

Yes, which is the same as the stiffness matrix for finite element methods.

>Is the extra work coming from a needed call to MatAXPY (to combine the 
> scaled mass matrix with the Jacobian) in Hong's approach? While, in theory, 
> the user can avoid the MatAXPY in the current code if they have custom code 
> that assembles directly the scaled mass matrix and 

Re: [petsc-dev] Proposed changes to TS API

2018-05-11 Thread Matthew Knepley
On Fri, May 11, 2018 at 12:25 AM, Smith, Barry F. wrote:

>
>
> > On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
> >
> > "Zhang, Hong"  writes:
> >
> >> Dear PETSc folks,
> >>
> >> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were
> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
> >> Shampine's paper
> >> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
> >> explains some reasoning behind it.
> >>
> >> Our formulation is general enough to cover all the following common
> cases
> >>
> >>  *   Udot = G(t,U) (classic ODE)
> >>  *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
> >>  *   M(t,U) Udot = G(t,U) (PDEs)
> >>
> >> Yet the TS APIs provide the capability to solve both simple problems
> and complicated problems. However, we are not doing well to make TS easy to
> use and efficient especially for simple problems. Over the years, we have
> clearly seen the main drawbacks including:
> >> 1. The shift parameter exposed in IJacobian is terribly confusing,
> especially to new users. Also it is not conceptually straightforward when
> using AD or finite differences on IFunction to approximate IJacobian.
> >
> > What isn't straightforward about AD or FD on the IFunction?  That one
> > bit of chain rule?
> >
> >> 2. It is difficult to switch from fully implicit to fully explicit.
> Users cannot use explicit methods when they provide IFunction/IJacobian.
> >
> > This is a real issue, but it's extremely common for PDE to have boundary
> > conditions enforced as algebraic constraints, thus yielding a DAE.
> >
> >> 3. The structure of mass matrix is completely invisible to TS. This
> means giving up all the opportunities to improve efficiency. For example,
> when M is constant or weakly dependent on U, we might not want to
> evaluate/update it every time IJacobian is called. If M is diagonal, the
> Jacobian can be shifted more efficiently than just using MatAXPY().
> >
> > I don't understand
> >
> >> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes
> buggy.
> >
> > Why is "reshifting" needed?  After a step is rejected and when the step
> > size changes for a linear constant operator?
> >
> >> Consider the scenario below.
> >> shift = a;
> >>  TSComputeIJacobian()
> >>  shift = b;
> >>  TSComputeIJacobian() // with the same U and Udot as last call
> >> Changing the shift parameter requires the Jacobian function to be
> evaluated again. If users provide only RHSJacobian, the Jacobian will not
> be updated/reshifted in the second call because TSComputeRHSJacobian()
> finds out that U has not been changed. This issue is fixable by adding more
> logic into the already convoluted implementation of TSComputeIJacobian(),
> but the intention here is to illustrate the cumbersomeness of current
> IJacobian and the growing complications in TSComputeIJacobian() that
> IJacobian causes.
> >>
> >> So I propose that we have two separate matrices dF/dUdot and dF/dU, and
> remove the shift parameter from IJacobian. dF/dU will be calculated by
> IJacobian; dF/dUdot will be calculated by a new callback function and
> default to an identity matrix if it is not provided by users. Then the
> users do not need to assemble the shifted Jacobian since TS will handle the
> shifting whenever needed. And knowing the structure of dF/dUdot (the mass
> matrix), TS will become more flexible. In particular, we will have
> >>
> >>  *   easy access to the unshifted Jacobian dF/dU (this simplifies the
> adjoint implementation a lot),
> >
> > How does this simplify the adjoint?
> >
> >>  *   plenty of opportunities to optimize TS when the mass matrix is
> diagonal or constant or weakly dependent on U (which accounts for almost
> all use cases in practice),
> >
> > But longer critical path,
>
>What do you mean by longer critical path?
>
> > more storage required, and more data motion.
>
>   The extra storage needed is related to the size of the mass matrix
> correct? And the extra data motion is related to the size of the mass
> matrix correct?
>
>Is the extra work coming from a needed call to MatAXPY (to combine the
> scaled mass matrix with the Jacobian) in Hong's approach? While, in theory,
> the user can avoid the MatAXPY in the current code if they have custom code
> that assembles directly the scaled mass matrix and Jacobian? But surely
> most users would not write such custom code and would themselves keep a
> copy of the mass matrix (likely constant) and use MatAXPY() to combine the
> copy of the mass matrix with the Jacobian they compute at each
> timestep/stage?  Or am I missing something?


I assemble the combined system directly.

   Matt

> And if the mass matrix is simple, won't it take a very small fraction of
> > time, thus have little gains from "optimizing it"?
>
>Is the Function 
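
A hedged sketch of what "assemble the combined system directly" can look like
under the current interface, with the shift folded into a single element loop
(the element data NEN, nelem, me, ke, idx and how they are filled are
illustrative, not a real API):

  #include <petscts.h>

  #define NEN 4  /* nodes per element, illustrative */

  static PetscErrorCode IJacobianCombined(TS ts, PetscReal t, Vec U, Vec Udot,
                                          PetscReal shift, Mat J, Mat Jpre, void *ctx)
  {
    PetscErrorCode ierr;
    PetscInt       e, i, j, nelem = 0;   /* nelem would come from ctx in real code */
    PetscScalar    me[NEN][NEN], ke[NEN][NEN], combined[NEN][NEN];
    PetscInt       idx[NEN];

    PetscFunctionBeginUser;
    ierr = MatZeroEntries(Jpre);CHKERRQ(ierr);
    for (e = 0; e < nelem; e++) {
      /* ... fill me[][] (mass part), ke[][] (dF/dU part), idx[] (global rows) ... */
      for (i = 0; i < NEN; i++)
        for (j = 0; j < NEN; j++) combined[i][j] = shift*me[i][j] + ke[i][j];
      ierr = MatSetValues(Jpre, NEN, idx, NEN, idx, &combined[0][0], ADD_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(Jpre, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

Only one matrix is touched and only one assembly is performed, which is what
the critical-path comparison above is about.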

Re: [petsc-dev] Proposed changes to TS API

2018-05-10 Thread Smith, Barry F.


> On May 10, 2018, at 4:12 PM, Jed Brown  wrote:
> 
> "Zhang, Hong"  writes:
> 
>> Dear PETSc folks,
>> 
>> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were designed 
>> for the fully implicit formulation F(t,U,Udot) = G(t,U).
>> Shampine's paper 
>> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>>  explains some reasoning behind it.
>> 
>> Our formulation is general enough to cover all the following common cases
>> 
>>  *   Udot = G(t,U) (classic ODE)
>>  *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
>>  *   M(t,U) Udot = G(t,U) (PDEs)
>> 
>> Yet the TS APIs provide the capability to solve both simple problems and 
>> complicated problems. However, we are not doing well to make TS easy to use 
>> and efficient especially for simple problems. Over the years, we have 
>> clearly seen the main drawbacks including:
>> 1. The shift parameter exposed in IJacobian is terribly confusing, 
>> especially to new users. Also it is not conceptually straightforward when 
>> using AD or finite differences on IFunction to approximate IJacobian.
> 
> What isn't straightforward about AD or FD on the IFunction?  That one
> bit of chain rule?
> 
>> 2. It is difficult to switch from fully implicit to fully explicit. Users 
>> cannot use explicit methods when they provide IFunction/IJacobian.
> 
> This is a real issue, but it's extremely common for PDE to have boundary
> conditions enforced as algebraic constraints, thus yielding a DAE.
> 
>> 3. The structure of mass matrix is completely invisible to TS. This means 
>> giving up all the opportunities to improve efficiency. For example, when M 
>> is constant or weekly dependent on U, we might not want to evaluate/update 
>> it every time IJacobian is called. If M is diagonal, the Jacobian can be 
>> shifted more efficiently than just using MatAXPY().
> 
> I don't understand 
> 
>> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
> 
> Why is "reshifting" needed?  After a step is rejected and when the step
> size changes for a linear constant operator?
> 
>> Consider the scenario below.
>> shift = a;
>>  TSComputeIJacobian()
>>  shift = b;
>>  TSComputeIJacobian() // with the same U and Udot as last call
>> Changing the shift parameter requires the Jacobian function to be evaluated 
>> again. If users provide only RHSJacobian, the Jacobian will not be 
>> updated/reshifted in the second call because TSComputeRHSJacobian() finds 
>> out that U has not been changed. This issue is fixable by adding more logic 
>> into the already convoluted implementation of TSComputeIJacobian(), but the 
>> intention here is to illustrate the cumbersomeness of current IJacobian and 
>> the growing complications in TSComputeIJacobian() that IJacobian causes.
>> 
>> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
>> remove the shift parameter from IJacobian. dF/dU will be calculated by 
>> IJacobian; dF/dUdot will be calculated by a new callback function and 
>> default to an identity matrix if it is not provided by users. Then the users 
>> do not need to assemble the shifted Jacobian since TS will handle the 
>> shifting whenever needed. And knowing the structure of dF/dUdot (the mass 
>> matrix), TS will become more flexible. In particular, we will have
>> 
>>  *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
>> adjoint implementation a lot),
> 
> How does this simplify the adjoint?
> 
>>  *   plenty of opportunities to optimize TS when the mass matrix is diagonal 
>> or constant or weakly dependent on U (which accounts for almost all use 
>> cases in practice),
> 
> But longer critical path,

   What do you mean by longer critical path?

> more storage required, and more data motion.

  The extra storage needed is related to the size of the mass matrix correct? 
And the extra data motion is related to the size of the mass matrix correct?

   Is the extra work coming from a needed call to MatAXPY (to combine the 
scaled mass matrix with the Jacobian) in Hong's approach? While, in theory, the 
user can avoid the MatAXPY in the current code if they have custom code that 
assembles directly the scaled mass matrix and Jacobian? But surely most users 
would not write such custom code and would themselves keep a copy of the mass 
matrix (likely constant) and use MatAXPY() to combine the copy of the mass 
matrix with the Jacobian they compute at each timestep/stage?  Or am I missing 
something?

> And if the mass matrix is simple, won't it take a very small fraction of
> time, thus have little gains from "optimizing it"?

   Is the Function approach only theoretically much more efficient than Hong's 
approach when the mass matrix is nontrivial, that is, when the mass matrix has 
a nonzero structure similar to the Jacobian's? 
> 
>>  *   easy 

Re: [petsc-dev] Proposed changes to TS API

2018-05-10 Thread Matthew Knepley
On Thu, May 10, 2018 at 5:12 PM, Jed Brown  wrote:

> "Zhang, Hong"  writes:
>
> > Dear PETSc folks,
> >
> > Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were
> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
> > Shampine's paper
> > (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
> > explains some reasoning behind it.
> >
> > Our formulation is general enough to cover all the following common cases
> >
> >   *   Udot = G(t,U) (classic ODE)
> >   *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
> >   *   M(t,U) Udot = G(t,U) (PDEs)
> >
> > Yet the TS APIs provide the capability to solve both simple problems and
> complicated problems. However, we are not doing well to make TS easy to use
> and efficient especially for simple problems. Over the years, we have
> clearly seen the main drawbacks including:
> > 1. The shift parameter exposed in IJacobian is terribly confusing,
> especially to new users. Also it is not conceptually straightforward when
> using AD or finite differences on IFunction to approximate IJacobian.
>
> What isn't straightforward about AD or FD on the IFunction?  That one
> bit of chain rule?


Is it automated like SNES?


>
> > 2. It is difficult to switch from fully implicit to fully explicit.
> Users cannot use explicit methods when they provide IFunction/IJacobian.
>
> This is a real issue, but it's extremely common for PDE to have boundary
> conditions enforced as algebraic constraints, thus yielding a DAE.


Good implementations eliminate those variables :)


> > 3. The structure of mass matrix is completely invisible to TS. This
> means giving up all the opportunities to improve efficiency. For example,
> when M is constant or weakly dependent on U, we might not want to
> evaluate/update it every time IJacobian is called. If M is diagonal, the
> Jacobian can be shifted more efficiently than just using MatAXPY().
>
> I don't understand


It definitely isn't a separate entity, but I do not see that it has to be.


> > 4. Reshifting the Jacobian is unnecessarily expensive and sometimes
> buggy.
>
> Why is "reshifting" needed?  After a step is rejected and when the step
> size changes for a linear constant operator?
>
> > Consider the scenario below.
> > shift = a;
> >   TSComputeIJacobian()
> >   shift = b;
> >   TSComputeIJacobian() // with the same U and Udot as last call
> > Changing the shift parameter requires the Jacobian function to be
> evaluated again. If users provide only RHSJacobian, the Jacobian will not
> be updated/reshifted in the second call because TSComputeRHSJacobian()
> finds out that U has not been changed. This issue is fixable by adding more
> logic into the already convoluted implementation of TSComputeIJacobian(),
> but the intention here is to illustrate the cumbersomeness of current
> IJacobian and the growing complications in TSComputeIJacobian() that
> IJacobian causes.
> >
> > So I propose that we have two separate matrices dF/dUdot and dF/dU, and
> remove the shift parameter from IJacobian. dF/dU will be calculated by
> IJacobian; dF/dUdot will be calculated by a new callback function and
> default to an identity matrix if it is not provided by users. Then the
> users do not need to assemble the shifted Jacobian since TS will handle the
> shifting whenever needed. And knowing the structure of dF/dUdot (the mass
> matrix), TS will become more flexible. In particular, we will have
> >
> >   *   easy access to the unshifted Jacobian dF/dU (this simplifies the
> adjoint implementation a lot),
>
> How does this simplify the adjoint?
>
> >   *   plenty of opportunities to optimize TS when the mass matrix is
> diagonal or constant or weakly dependent on U (which accounts for almost
> all use cases in practice),
>
> But longer critical path, more storage required, and more data motion.
> And if the mass matrix is simple, won't it take a very small fraction of
> time, thus have little gains from "optimizing it"?
>

I agree here.

  Matt


> >   *   easy conversion from fully implicit to fully explicit,
> >   *   an IJacobian without shift parameter that is easy to understand
> and easy to port to other software.
>
> Note that even CVODE has an interface similar to PETSc; e.g., gamma
> parameter in CVSpilsPrecSetupFn.
>
> > Regarding the changes on the user side, most IJacobian users should not
> have problem splitting the old Jacobian if they compute dF/dUdot and dF/dU
> explicitly. If dF/dUdot is too complicated to build, matrix-free is an
> alternative option.
> >
> > While this proposal is somehow related to Barry's idea of having a
> TSSetMassMatrix() last year, I hope it provides more details for your
> information. Any of your comments and feedback would be appreciated.
> >
> > Thanks,
> > Hong
>



-- 
What most experimenters take for granted before 

Re: [petsc-dev] Proposed changes to TS API

2018-05-10 Thread Zhang, Hong


On May 10, 2018, at 4:12 PM, Jed Brown wrote:

> "Zhang, Hong" writes:

>> Dear PETSc folks,
>>
>> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were designed for
>> the fully implicit formulation F(t,U,Udot) = G(t,U).
>> Shampine's paper
>> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>> explains some reasoning behind it.
>>
>> Our formulation is general enough to cover all the following common cases
>>
>>  *   Udot = G(t,U) (classic ODE)
>>  *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
>>  *   M(t,U) Udot = G(t,U) (PDEs)
>>
>> Yet the TS APIs provide the capability to solve both simple problems and
>> complicated problems. However, we are not doing well to make TS easy to use and
>> efficient especially for simple problems. Over the years, we have clearly seen
>> the main drawbacks including:
>> 1. The shift parameter exposed in IJacobian is terribly confusing, especially
>> to new users. Also it is not conceptually straightforward when using AD or
>> finite differences on IFunction to approximate IJacobian.
>
> What isn't straightforward about AD or FD on the IFunction?  That one
> bit of chain rule?

Assembling shift*dF/dUdot+dF/dU could mean just a bit of extra work for 
experienced users or experts, but it can be fairly annoying for others. 
Given IFunction, an AD tool generates dF/dUdot and dF/dU naturally, but it 
would not do that extra "bit of chain rule" automatically. It is our fault to 
force users to understand and worry about the shift business when PETSc could 
easily do it for them.
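
Spelling out the "bit of chain rule" in question (a sketch of the standard
argument in LaTeX, with sigma denoting the shift, i.e. the factor dUdot/dU in
the stage equation TS is solving):

  \frac{d}{dU} F\bigl(t, U, \dot U(U)\bigr)
    = \frac{\partial F}{\partial U}
      + \frac{\partial F}{\partial \dot U}\,\frac{\partial \dot U}{\partial U}
    = \frac{\partial F}{\partial U} + \sigma\,\frac{\partial F}{\partial \dot U}.

AD or FD on IFunction yields the two partial derivatives on the right; the
remaining step is only the sigma-weighted sum, which is the combination argued
above that PETSc could form itself.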


>> 2. It is difficult to switch from fully implicit to fully explicit. Users
>> cannot use explicit methods when they provide IFunction/IJacobian.
>
> This is a real issue, but it's extremely common for PDE to have boundary
> conditions enforced as algebraic constraints, thus yielding a DAE.

Of course the conversion is not always possible. But it is possible in many 
cases if we have the mass matrix. That is one of the reasons why we want to 
have dF/dUdot.


>> 3. The structure of mass matrix is completely invisible to TS. This means
>> giving up all the opportunities to improve efficiency. For example, when M is
>> constant or weakly dependent on U, we might not want to evaluate/update it
>> every time IJacobian is called. If M is diagonal, the Jacobian can be shifted
>> more efficiently than just using MatAXPY().
>
> I don't understand

Whenever IJacobian is called, both dF/dUdot (the mass matrix) and dF/dU will be 
computed. Think about the case M Udot = G(t,U) where M is constant. For 
efficiency, we actually just need to update the Jacobian of G during time 
integration and reuse the mass matrix. This is just one of many cases where we 
could take advantage of the structure of the mass matrix, and hiding the mass 
matrix inside IJacobian prevents us from doing that.
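
A minimal sketch of that by-hand reuse under the current interface, assuming a
constant M cached in an application context (AppCtx and FormGJacobian() are
illustrative names; the callback would be registered with TSSetIJacobian() in
the usual way):

  #include <petscts.h>

  typedef struct {
    Mat M;   /* constant dF/dUdot, assembled once before TSSolve() */
  } AppCtx;

  extern PetscErrorCode FormGJacobian(PetscReal t, Vec U, Mat Jpre, AppCtx *user); /* assembles dG/dU */

  static PetscErrorCode IJacobianConstMass(TS ts, PetscReal t, Vec U, Vec Udot,
                                           PetscReal shift, Mat J, Mat Jpre, void *ctx)
  {
    AppCtx        *user = (AppCtx*)ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    /* For F(t,U,Udot) = M Udot - G(t,U):  dF/dU = -dG/dU  and  dF/dUdot = M. */
    ierr = FormGJacobian(t, U, Jpre, user);CHKERRQ(ierr);   /* Jpre <- dG/dU (assembled) */
    ierr = MatScale(Jpre, -1.0);CHKERRQ(ierr);              /* Jpre <- -dG/dU            */
    ierr = MatAXPY(Jpre, shift, user->M, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr); /* += shift*M */
    PetscFunctionReturn(0);
  }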


>> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
>
> Why is "reshifting" needed?  After a step is rejected and when the step
> size changes for a linear constant operator?

It doesn't have to be a linear constant operator. Reusing the Jacobian across 
stages or even time steps for nonlinear problems is not uncommon.
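
For what it's worth, such reuse is often expressed today by lagging the SNES
Jacobian, e.g. (a rough sketch; also reachable through the -snes_lag_jacobian
option):

  SNES snes;
  ierr = TSGetSNES(ts, &snes);CHKERRQ(ierr);
  ierr = SNESSetLagJacobian(snes, 2);CHKERRQ(ierr);  /* rebuild only every 2nd Newton iteration */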


>> Consider the scenario below.
>> shift = a;
>>  TSComputeIJacobian()
>>  shift = b;
>>  TSComputeIJacobian() // with the same U and Udot as last call
>> Changing the shift parameter requires the Jacobian function to be evaluated
>> again. If users provide only RHSJacobian, the Jacobian will not be
>> updated/reshifted in the second call because TSComputeRHSJacobian() finds out
>> that U has not been changed. This issue is fixable by adding more logic into
>> the already convoluted implementation of TSComputeIJacobian(), but the
>> intention here is to illustrate the cumbersomeness of current IJacobian and the
>> growing complications in TSComputeIJacobian() that IJacobian causes.
>>
>> So I propose that we have two separate matrices dF/dUdot and dF/dU, and remove
>> the shift parameter from IJacobian. dF/dU will be calculated by IJacobian;
>> dF/dUdot will be calculated by a new callback function and default to an
>> identity matrix if it is not provided by users. Then the users do not need to
>> assemble the shifted Jacobian since TS will handle the shifting whenever
>> needed. And knowing the structure of dF/dUdot (the mass matrix), TS will become
>> more flexible. In particular, we will have
>>
>>  *   easy access to the unshifted Jacobian dF/dU (this simplifies the adjoint
>> implementation a lot),
>
> How does this simplify the adjoint?

For example, when computing the adjoint for M Udot = G(t,U), I often need to 
compute just dG/dU or to multiply M by the adjoint variables. It is doable 
with IJacobian, but tricky and inefficient since 

Re: [petsc-dev] Proposed changes to TS API

2018-05-10 Thread Jed Brown
"Zhang, Hong"  writes:

> Dear PETSc folks,
>
> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were designed 
> for the fully implicit formulation F(t,U,Udot) = G(t,U).
> Shampine's paper 
> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>  explains some reasoning behind it.
>
> Our formulation is general enough to cover all the following common cases
>
>   *   Udot = G(t,U) (classic ODE)
>   *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
>   *   M(t,U) Udot = G(t,U) (PDEs)
>
> Yet the TS APIs provide the capability to solve both simple problems and 
> complicated problems. However, we are not doing well to make TS easy to use 
> and efficient especially for simple problems. Over the years, we have clearly 
> seen the main drawbacks including:
> 1. The shift parameter exposed in IJacobian is terribly confusing, especially 
> to new users. Also it is not conceptually straightforward when using AD or 
> finite differences on IFunction to approximate IJacobian.

What isn't straightforward about AD or FD on the IFunction?  That one
bit of chain rule?

> 2. It is difficult to switch from fully implicit to fully explicit. Users 
> cannot use explicit methods when they provide IFunction/IJacobian.

This is a real issue, but it's extremely common for PDE to have boundary
conditions enforced as algebraic constraints, thus yielding a DAE.

> 3. The structure of mass matrix is completely invisible to TS. This means 
> giving up all the opportunities to improve efficiency. For example, when M is 
> constant or weakly dependent on U, we might not want to evaluate/update it 
> every time IJacobian is called. If M is diagonal, the Jacobian can be shifted 
> more efficiently than just using MatAXPY().

I don't understand 

> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.

Why is "reshifting" needed?  After a step is rejected and when the step
size changes for a linear constant operator?

> Consider the scenario below.
> shift = a;
>   TSComputeIJacobian()
>   shift = b;
>   TSComputeIJacobian() // with the same U and Udot as last call
> Changing the shift parameter requires the Jacobian function to be evaluated 
> again. If users provide only RHSJacobian, the Jacobian will not be 
> updated/reshifted in the second call because TSComputeRHSJacobian() finds out 
> that U has not been changed. This issue is fixable by adding more logic into 
> the already convoluted implementation of TSComputeIJacobian(), but the 
> intention here is to illustrate the cumbersomeness of current IJacobian and 
> the growing complications in TSComputeIJacobian() that IJacobian causes.
>
> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
> remove the shift parameter from IJacobian. dF/dU will be calculated by 
> IJacobian; dF/dUdot will be calculated by a new callback function and default 
> to an identity matrix if it is not provided by users. Then the users do not 
> need to assemble the shifted Jacobian since TS will handle the shifting 
> whenever needed. And knowing the structure of dF/dUdot (the mass matrix), TS 
> will become more flexible. In particular, we will have
>
>   *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
> adjoint implementation a lot),

How does this simplify the adjoint?

>   *   plenty of opportunities to optimize TS when the mass matrix is diagonal 
> or constant or weakly dependent on U (which accounts for almost all use cases 
> in practice),

But longer critical path, more storage required, and more data motion.
And if the mass matrix is simple, won't it take a very small fraction of
time, thus have little gains from "optimizing it"?

>   *   easy conversion from fully implicit to fully explicit,
>   *   an IJacobian without shift parameter that is easy to understand and 
> easy to port to other software.

Note that even CVODE has an interface similar to PETSc; e.g., gamma
parameter in CVSpilsPrecSetupFn.

> Regarding the changes on the user side, most IJacobian users should not have 
> problem splitting the old Jacobian if they compute dF/dUdot and dF/dU 
> explicitly. If dF/dUdot is too complicated to build, matrix-free is an 
> alternative option.
>
> While this proposal is somehow related to Barry's idea of having a 
> TSSetMassMatrix() last year, I hope it provides more details for your 
> information. Any of your comments and feedback would be appreciated.
>
> Thanks,
> Hong


[petsc-dev] Proposed changes to TS API

2018-05-10 Thread Zhang, Hong
Dear PETSc folks,

Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were designed for 
the fully implicit formulation F(t,U,Udot) = G(t,U).
Shampine's paper 
(https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
 explains some reasoning behind it.

Our formulation is general enough to cover all the following common cases

  *   Udot = G(t,U) (classic ODE)
  *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
  *   M(t,U) Udot = G(t,U) (PDEs)

Yet the TS APIs provide the capability to solve both simple and complicated 
problems. However, we are not doing well at making TS easy to use and 
efficient, especially for simple problems. Over the years, we have clearly 
seen the main drawbacks, including:
1. The shift parameter exposed in IJacobian is terribly confusing, especially 
to new users. Also, it is not conceptually straightforward to approximate 
IJacobian by applying AD or finite differences to IFunction.
2. It is difficult to switch from fully implicit to fully explicit. Users 
cannot use explicit methods when they provide IFunction/IJacobian.
3. The structure of the mass matrix is completely invisible to TS. This means 
giving up all opportunities to improve efficiency. For example, when M is 
constant or weakly dependent on U, we might not want to evaluate/update it 
every time IJacobian is called. If M is diagonal, the Jacobian can be shifted 
more efficiently than just using MatAXPY().
4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
Consider the scenario below.
shift = a;
  TSComputeIJacobian()
  shift = b;
  TSComputeIJacobian() // with the same U and Udot as last call
Changing the shift parameter requires the Jacobian function to be evaluated 
again. If users provide only RHSJacobian, the Jacobian will not be 
updated/reshifted in the second call because TSComputeRHSJacobian() finds 
that U has not changed. This issue is fixable by adding more logic to the 
already convoluted implementation of TSComputeIJacobian(), but the intention 
here is to illustrate the cumbersomeness of the current IJacobian and the 
growing complications it causes in TSComputeIJacobian().
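
For reference, the matrix the current IJacobian callback is expected to return
is the combined quantity (in LaTeX, with sigma denoting the shift):

  J(\sigma) = \sigma\,\frac{\partial F}{\partial \dot U} + \frac{\partial F}{\partial U}.

Because TS only ever stores this combined J(sigma), going from sigma = a to
sigma = b with the same (U, Udot) still requires a full rebuild; with dF/dUdot
and dF/dU kept as separate matrices, the same update would reduce to a scaled
matrix sum.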

So I propose that we have two separate matrices dF/dUdot and dF/dU, and remove 
the shift parameter from IJacobian. dF/dU will be calculated by IJacobian; 
dF/dUdot will be calculated by a new callback function and default to an 
identity matrix if it is not provided by users. Then the users do not need to 
assemble the shifted Jacobian since TS will handle the shifting whenever 
needed. And knowing the structure of dF/dUdot (the mass matrix), TS will become 
more flexible. In particular, we will have

  *   easy access to the unshifted Jacobian dF/dU (this simplifies the adjoint 
implementation a lot),
  *   plenty of opportunities to optimize TS when the mass matrix is diagonal 
or constant or weakly dependent on U (which accounts for almost all use cases 
in practice),
  *   easy conversion from fully implicit to fully explicit,
  *   an IJacobian without the shift parameter that is easy to understand and 
easy to port to other software.
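
To make the proposal concrete, here is a purely hypothetical sketch of what a
split interface might look like; TSSetIJacobianU() and TSSetIJacobianUdot() do
not exist in PETSc and are named here only for illustration:

  /* HYPOTHETICAL sketch only: these setters and callback signatures are made
     up to illustrate the shape of the proposal (unshifted dF/dU from one
     callback, dF/dUdot from another, defaulting to the identity). */
  static PetscErrorCode MyJacU(TS ts, PetscReal t, Vec U, Mat A, Mat B, void *ctx)
  {
    /* assemble dF/dU only; no shift appears here */
    return 0;
  }

  static PetscErrorCode MyJacUdot(TS ts, PetscReal t, Vec U, Mat M, void *ctx)
  {
    /* assemble dF/dUdot (the mass matrix) */
    return 0;
  }

  static PetscErrorCode SetupHypothetical(TS ts, Mat JU, Mat Mass, void *user)
  {
    PetscErrorCode ierr;
    ierr = TSSetIJacobianU(ts, JU, JU, MyJacU, user);CHKERRQ(ierr);       /* hypothetical */
    ierr = TSSetIJacobianUdot(ts, Mass, MyJacUdot, user);CHKERRQ(ierr);   /* hypothetical */
    /* TS itself would then form shift*Mass + JU whenever an implicit solve needs it. */
    return 0;
  }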

Regarding the changes on the user side, most IJacobian users should not have 
a problem splitting the old Jacobian if they compute dF/dUdot and dF/dU 
explicitly. If dF/dUdot is too complicated to build, matrix-free is an 
alternative option.
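
One hedged reading of the matrix-free option: provide dF/dUdot as a MATSHELL
whose action is applied on the fly. MassCtx, MassMult() and CreateMassShell()
are illustrative names; MatCreateShell() and MatShellSetOperation() are real
PETSc calls.

  #include <petscts.h>

  typedef struct { Vec work; } MassCtx;   /* whatever is needed to apply the mass action */

  static PetscErrorCode MassMult(Mat M, Vec x, Vec y)
  {
    MassCtx       *user;
    PetscErrorCode ierr;
    ierr = MatShellGetContext(M, &user);CHKERRQ(ierr);
    /* ... apply y = (dF/dUdot) x here, without ever assembling the matrix ... */
    return 0;
  }

  static PetscErrorCode CreateMassShell(MPI_Comm comm, PetscInt nlocal, PetscInt N,
                                        MassCtx *user, Mat *M)
  {
    PetscErrorCode ierr;
    ierr = MatCreateShell(comm, nlocal, nlocal, N, N, user, M);CHKERRQ(ierr);
    ierr = MatShellSetOperation(*M, MATOP_MULT, (void (*)(void))MassMult);CHKERRQ(ierr);
    return 0;
  }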

While this proposal is somewhat related to Barry's idea of having a 
TSSetMassMatrix() last year, I hope it provides more detail for your 
consideration. Any comments and feedback would be appreciated.

Thanks,
Hong