> On May 11, 2018, at 8:03 AM, Stefano Zampini <stefano.zamp...@gmail.com> 
> wrote:
> 
> I don’t think changing the current TS API is best approach.
> 
> Obtaining separate Jacobians is a need for adjoints and tangent linear models 
> only.
> This is how I implemented it in stefano_zampini/feature-continuousadjoint
> https://bitbucket.org/petsc/petsc/src/7203629c61cbeced536ed8ba1dd2ef85ffb89e8f/src/ts/interface/tssplitjac.c#lines-48
> 
> Note that, instead of requiring the user to call PetscObjectComposeFunction, 
> we can use a function pointer and have TSSetComputeSplitJacobians

   So you are proposing keeping TSSetIFunction and TSSetIJacobian and ADDING a 
new API TSSetComputeSplitJacobians(), and if that is not provided, falling back 
to calling TSComputeIJacobian() twice with different shifts (which is 
definitely not efficient and is what Hong does also).
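
   For concreteness, one possible shape of such an interface (the callback name 
follows the branch, but the signature below is only a guess for illustration):

    typedef PetscErrorCode (*TSSplitJacobiansFn)(TS,PetscReal,Vec,Vec,
                                                 Mat,   /* dF/dU    */
                                                 Mat,   /* dF/dUdot */
                                                 void*);
    PetscErrorCode TSSetComputeSplitJacobians(TS ts,TSSplitJacobiansFn f,void *ctx);

    /* Fallback when the user does not provide it: two IJacobian calls,
       J(0) = dF/dU and J(1) - J(0) = dF/dUdot, i.e. one extra evaluation. */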

   Barry

> 
> 
> 
>> On May 11, 2018, at 3:20 PM, Jed Brown <j...@jedbrown.org> wrote:
>> 
>> "Smith, Barry F." <bsm...@mcs.anl.gov> writes:
>> 
>>>> On May 10, 2018, at 4:12 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>> 
>>>> "Zhang, Hong" <hongzh...@anl.gov> writes:
>>>> 
>>>>> Dear PETSc folks,
>>>>> 
>>>>> Current TS APIs (IFunction/IJacobian+RHSFunction/RHSJacobian) were 
>>>>> designed for the fully implicit formulation F(t,U,Udot) = G(t,U).
>>>>> Shampine's paper 
>>>>> (https://www.sciencedirect.com/science/article/pii/S0377042706004110?via%3Dihub)
>>>>>  explains some reasoning behind it.
>>>>> 
>>>>> Our formulation is general enough to cover all the following common cases
>>>>> 
>>>>> *   Udot = G(t,U) (classic ODE)
>>>>> *   M Udot = G(t,U)  (ODEs/DAEs for mechanical and electronic systems)
>>>>> *   M(t,U) Udot = G(t,U) (PDEs)
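>>>>> 
>>>>> For instance, in the second case the residual passed to TSSetIFunction is 
>>>>> just F(t,U,Udot) = M Udot - G(t,U); a minimal sketch (AppCtx, Gwork and 
>>>>> EvaluateG() are illustrative names, and M is assumed constant and 
>>>>> pre-assembled):
>>>>> 
>>>>>   PetscErrorCode IFunction(TS ts,PetscReal t,Vec U,Vec Udot,Vec F,void *ctx)
>>>>>   {
>>>>>     AppCtx         *user = (AppCtx*)ctx;
>>>>>     PetscErrorCode ierr;
>>>>> 
>>>>>     PetscFunctionBeginUser;
>>>>>     ierr = MatMult(user->M,Udot,F);CHKERRQ(ierr);        /* F  = M Udot   */
>>>>>     ierr = EvaluateG(ts,t,U,user->Gwork);CHKERRQ(ierr);  /* user's G(t,U) */
>>>>>     ierr = VecAXPY(F,-1.0,user->Gwork);CHKERRQ(ierr);    /* F -= G(t,U)   */
>>>>>     PetscFunctionReturn(0);
>>>>>   }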
>>>>> 
>>>>> The TS APIs thus provide the capability to solve both simple and 
>>>>> complicated problems. However, we have not done well at making TS easy to 
>>>>> use and efficient, especially for simple problems. Over the years, the 
>>>>> main drawbacks have become clear:
>>>>> 1. The shift parameter exposed in IJacobian is terribly confusing, 
>>>>> especially to new users. It is also not conceptually straightforward to 
>>>>> use AD or finite differences on IFunction to approximate IJacobian.
>>>> 
>>>> What isn't straightforward about AD or FD on the IFunction?  That one
>>>> bit of chain rule?
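>>>> 
>>>> The chain rule amounts to perturbing Udot consistently with U.  A 
>>>> matrix-free FD action of the shifted Jacobian is just (a sketch; h is the 
>>>> differencing parameter, a the shift, v the direction, and Up, Udotp, Fp, 
>>>> F0, Jv are work vectors):
>>>> 
>>>>   ierr = VecWAXPY(Up,h,v,U);CHKERRQ(ierr);         /* Up    = U    + h v   */
>>>>   ierr = VecWAXPY(Udotp,a*h,v,Udot);CHKERRQ(ierr); /* Udotp = Udot + a h v */
>>>>   ierr = TSComputeIFunction(ts,t,Up,Udotp,Fp,PETSC_FALSE);CHKERRQ(ierr);
>>>>   ierr = TSComputeIFunction(ts,t,U,Udot,F0,PETSC_FALSE);CHKERRQ(ierr);
>>>>   ierr = VecWAXPY(Jv,-1.0,F0,Fp);CHKERRQ(ierr);    /* Jv = Fp - F0          */
>>>>   ierr = VecScale(Jv,1.0/h);CHKERRQ(ierr);         /* ~ (dF/dU + a dF/dUdot) v */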
>>>> 
>>>>> 2. It is difficult to switch from fully implicit to fully explicit. Users 
>>>>> cannot use explicit methods when they provide IFunction/IJacobian.
>>>> 
>>>> This is a real issue, but it's extremely common for PDEs to have boundary
>>>> conditions enforced as algebraic constraints, thus yielding a DAE.
>>>> 
>>>>> 3. The structure of the mass matrix is completely invisible to TS. This 
>>>>> means giving up many opportunities to improve efficiency. For example, when 
>>>>> M is constant or weakly dependent on U, we might not want to 
>>>>> evaluate/update it every time IJacobian is called. If M is diagonal, the 
>>>>> Jacobian can be shifted more efficiently than just using MatAXPY().
>>>> 
>>>> I don't understand 
>>>> 
>>>>> 4. Reshifting the Jacobian is unnecessarily expensive and sometimes buggy.
>>>> 
>>>> Why is "reshifting" needed?  After a step is rejected and when the step
>>>> size changes for a linear constant operator?
>>>> 
>>>>> Consider the scenario below.
>>>>> shift = a;
>>>>> TSComputeIJacobian()
>>>>> shift = b;
>>>>> TSComputeIJacobian() // with the same U and Udot as last call
>>>>> Changing the shift parameter requires the Jacobian function to be 
>>>>> evaluated again. If users provide only RHSJacobian, the Jacobian will not 
>>>>> be updated/reshifted in the second call because TSComputeRHSJacobian() 
>>>>> finds that U has not changed. This issue is fixable by adding more logic 
>>>>> to the already convoluted implementation of TSComputeIJacobian(), but the 
>>>>> intention here is to illustrate the cumbersomeness of the current 
>>>>> IJacobian interface and the growing complications it causes in 
>>>>> TSComputeIJacobian().
>>>>> 
>>>>> So I propose that we have two separate matrices dF/dUdot and dF/dU, and 
>>>>> remove the shift parameter from IJacobian. dF/dU will be calculated by 
>>>>> IJacobian; dF/dUdot will be calculated by a new callback function and 
>>>>> default to an identity matrix if it is not provided by the user. Users 
>>>>> then do not need to assemble the shifted Jacobian, since TS will handle 
>>>>> the shifting whenever needed. Knowing the structure of dF/dUdot (the 
>>>>> mass matrix) also makes TS more flexible. In particular, we will have
>>>>> 
>>>>> *   easy access to the unshifted Jacobian dF/dU (this simplifies the 
>>>>> adjoint implementation a lot),
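>>>>> 
>>>>> To make this concrete, the user-side callbacks could look something like 
>>>>> the following (IDotJacobian and TSSetIDotJacobian are placeholder names 
>>>>> for illustration, not a final API):
>>>>> 
>>>>>   /* dF/dU only; no shift parameter anywhere */
>>>>>   PetscErrorCode IJacobian(TS ts,PetscReal t,Vec U,Vec Udot,Mat J,Mat Jpre,void *ctx);
>>>>> 
>>>>>   /* dF/dUdot (the mass matrix); defaults to the identity if never set */
>>>>>   PetscErrorCode IDotJacobian(TS ts,PetscReal t,Vec U,Mat M,Mat Mpre,void *ctx);
>>>>>   PetscErrorCode TSSetIDotJacobian(TS ts,Mat M,Mat Mpre,
>>>>>                    PetscErrorCode (*f)(TS,PetscReal,Vec,Mat,Mat,void*),void *ctx);
>>>>> 
>>>>>   /* TS itself then forms the solver matrix whenever a stage needs it, e.g. */
>>>>>   ierr = MatCopy(J,Jshifted,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
>>>>>   ierr = MatAXPY(Jshifted,shift,M,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);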
>>>> 
>>>> How does this simplify the adjoint?
>>>> 
>>>>> *   plenty of opportunities to optimize TS when the mass matrix is 
>>>>> diagonal or constant or weakly dependent on U (which accounts for almost 
>>>>> all use cases in practice),
>>>> 
>>>> But longer critical path,
>>> 
>>>   What do you mean by longer critical path?
>> 
>> Create Mass (dF/dUdot) matrix, call MatAssembly, create dF/dU, call
>> MatAssembly, call MatAXPY (involves another MatAssembly unless
>> SAME_NONZERO_PATTERN).  That's a long sequence for what could be one
>> MatAssembly.  Also note that geometric setup for elements is usually
>> done in each element loop.  For simple physics, this is way more
>> expensive than the physics (certainly the case for LibMesh and Deal.II).
>> 
>>>> more storage required, and more data motion.
>>> 
>>>  The extra storage needed is related to the size of the mass matrix, 
>>> correct? And the extra data motion is related to the size of the mass 
>>> matrix, correct?
>> 
>> Yes, which is the same as the stiffness matrix for finite element methods.
>> 
>>>   Is the extra work coming from a needed call to MatAXPY (to combine the 
>>> scaled mass matrix with the Jacobian) in Hong's approach? While, in theory, 
>>> the user can avoid the MatAXPY in the current code if they have custom code 
>>> that directly assembles the scaled mass matrix plus the Jacobian? But surely 
>>> most users would not write such custom code and would instead keep a 
>>> copy of the mass matrix (likely constant) and use MatAXPY() to combine the 
>>> copy of the mass matrix with the Jacobian they compute at each 
>>> timestep/stage?  Or am I missing something?
>> 
>> It's better (and at least as convenient) to write code that assembles
>> into the mass matrix (at the element scale if you want to amortize mesh
>> traversal, but can also be a new traversal without needing extra
>> MatAssembly).  Then instead of MatAXPY(), you call the code that
>> ADD_VALUES the mass part.  I think storing the mass matrix away
>> somewhere is a false economy in almost all cases.
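>> 
>> In other words, something like the following inside the existing IJacobian 
>> (a sketch; AssembleStiffness() and AssembleMass() stand in for the user's 
>> element loops that call MatSetValues with ADD_VALUES):
>> 
>>   ierr = MatZeroEntries(Jpre);CHKERRQ(ierr);
>>   ierr = AssembleStiffness(dm,t,U,Jpre,ctx);CHKERRQ(ierr);      /* adds dF/dU terms        */
>>   ierr = AssembleMass(dm,a,Jpre,ctx);CHKERRQ(ierr);             /* adds a * dF/dUdot terms */
>>   ierr = MatAssemblyBegin(Jpre,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); /* one assembly, no MatAXPY */
>>   ierr = MatAssemblyEnd(Jpre,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);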
>> 
>> There is also the issue that matrix-free preconditioning is much more
>> confusing with the new proposed scheme.  As it is now, the matrix needed
>> by the solver is specified and the user can choose how to approximate
>> it.  If only the pieces are specified, then a PCShell will need to
>> understand the result of a MatAXPY with shell matrices.
>> 
>>>> And if the mass matrix is simple, won't it take a very small fraction of
>>>> the time, and thus offer little gain from "optimizing" it?
>>> 
>>>   Is the Function approach only theoretically much more efficient than 
>>> Hong's approach when the mass matrix is nontrivial, that is, when the mass 
>>> matrix has a nonzero structure similar to the Jacobian's? 
>> 
>> The extra MatAssembly is called even for MatAXPY with
>> SUBSET_NONZERO_PATTERN.  But apart from strong scaling concerns like
>> that extra communication (which could be optimized, but there are
>> several formats to optimize), any system should be sufficiently fast if
>> the mass matrix is trivial, because that means it involves much less work
>> than the dF/dU matrix.
>> 
>>>>> *   easy conversion from fully implicit to fully explicit,
>>>>> *   an IJacobian without shift parameter that is easy to understand and 
>>>>> easy to port to other software.
>>>> 
>>>> Note that even CVODE has an interface similar to PETSc's; e.g., the gamma
>>>> parameter in CVSpilsPrecSetupFn.
>>>> 
>>>>> Regarding the changes on the user side, most IJacobian users should not 
>>>>> have a problem splitting the old Jacobian if they compute dF/dUdot and 
>>>>> dF/dU explicitly. If dF/dUdot is too complicated to build, matrix-free is 
>>>>> an alternative option.
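>>>>> 
>>>>> For example, dF/dUdot could be provided as a shell matrix (a sketch; 
>>>>> MassMult() is a user routine applying the mass operator, nloc the local 
>>>>> size, and user an application context):
>>>>> 
>>>>>   Mat M;
>>>>>   ierr = MatCreateShell(PETSC_COMM_WORLD,nloc,nloc,PETSC_DETERMINE,PETSC_DETERMINE,&user,&M);CHKERRQ(ierr);
>>>>>   ierr = MatShellSetOperation(M,MATOP_MULT,(void (*)(void))MassMult);CHKERRQ(ierr);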
>>>>> 
>>>>> While this proposal is somewhat related to Barry's idea last year of 
>>>>> having a TSSetMassMatrix(), I hope it provides more detail. Any comments 
>>>>> and feedback would be appreciated.
>>>>> 
>>>>> Thanks,
>>>>> Hong
> 
