TS: high-order with Galerkin spatial discretization, semi-explicit index-1 DAE

2009-03-20 Thread Jed Brown
On Thu 2009-03-19 08:04, Matthew Knepley wrote:
 Okay, I understand now. You are correct that this formulation should
 be available.  In fact, I think it should be possible to make this the
 bottom of the hierarchy, with the linear case a specialization, and
 the identity case a specialization of that (with an MF
 application). This is a good chance to redesign TS.

Great.  I also prefer having the nonlinear DAE case be the bottom of the
hierarchy.  I think most specializations (and backward compatibility
with the current interface, if desired) could be achieved with thin
wrappers similar in spirit to SNESDAFormFunction.


The interface I have in mind is quite similar to IDA.  The DAE/ODE is
written in the form

  F(x_t,x,t) = 0

With an implicit method, the nonlinear equation that needs to be solved is

  G(x) = F(y + a x, x, t) = 0

(y is known and specific to the method) which has Jacobian

  J(x) = dG/dx = a dF/dx_t + dF/dx


The decision in IDA, translated to PETSc conventions, is that the user
provides F

  PetscErrorCode TSFormFunction_1(TS,Vec x_t,Vec x,PetscReal t,Vec f,void*);

and J(a,x_t,x,t)

  PetscErrorCode TSFormJacobian_1(TS,PetscReal a,Vec x_t,Vec x,PetscReal t,Mat jac,Mat jpre,MatStructure*,void*);

An alternative would be to provide G and G' as

  PetscErrorCode TSFormFunction_2(TS,PetscReal a,Vec y,Vec x,PetscReal t,Vec f,void*);
  PetscErrorCode TSFormJacobian_2(TS,PetscReal a,Vec y,Vec x,PetscReal t,Mat jac,Mat jpre,MatStructure*,void*);

in which case the SNES wrappers would not need to do the VecWAXPY
implied by interface 1.  I don't know if there is a compelling reason
not to choose interface 2.

Explicit methods can use the same function evaluation (calling with
a=0), but they need a wrapper when used with a mass matrix other than
the identity (which only makes sense if it's trivial to solve with,
e.g. block diagonal as with DG).  The wrapper would just turn

  M x_t = - F(0,x,t)

into

  x_t = G(x,t) = - M^{-1} F(0,x,t)




TS: high-order with Galerkin spatial discretization, semi-explicit index-1 DAE

2009-03-19 Thread Jed Brown
On Wed 2009-03-18 13:59, Matthew Knepley wrote:
 I cannot understand why you cannot use TSSetMatrices()

TSSetMatrices is only for linear problems and doesn't generalize nicely
to nonlinear problems, especially with matrix-free methods.  The
functionality I'm looking for involves a possibly variable, possibly
singular (with huge null space for DAE), possibly nonlinear (as with fully
implicit moving mesh schemes) Alhs.  With my high-order Galerkin scheme,
Alhs can only be available as MatShell (so MatAXPY is hard to implement)
because an assembled Alhs would be an order of magnitude denser than the
preconditioning matrix associated with the unshifted Jacobian (J=F' if
we are solving Alhs(u,t) u_t = F(u,t)).

This can all be handled in a way that works nicely with JFNK and
avoids storing Alhs in any form, but I think the user's function
evaluation and matrix assembly need to provide shifted versions (such
as 1/dt*Alhs - J where J = F'(u)).  The shift is usually a trivial amount
of work during function evaluation and matrix assembly.  Preconditioners
can still be lagged effectively as long as the shift does not change
rapidly.  Even if the shift does change rapidly, reassembling the
preconditioning matrix could still often be avoided by just doing a
lumped shift (function evaluation and matrix-free Jacobians can change
the shift, using a consistent mass matrix, at zero cost).

Is this more clear?

Jed
-- next part --
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: not available
URL: 
http://lists.mcs.anl.gov/pipermail/petsc-dev/attachments/20090318/220d63eb/attachment.pgp


TS: high-order with Galerkin spatial discretization, semi-explicit index-1 DAE

2009-03-19 Thread Jed Brown
On Wed 2009-03-18 18:34, Matthew Knepley wrote:
 Not a whole lot. Needs to go slower, since I think there are a bunch
 of embedded assumptions, and I worry that the most simple, general
 interface cannot be seen until they are clarified. I want it spelled
 out very very explicitly. So, to begin, you are unhappy with
 
   A_lhs(t) u_t = F(u,t)
 
 where A_lhs is a linear operator. Is this true?

Yes, I agree that this would be the natural generalization of the
existing design for linear problems.  I think it is not a good option
for several reasons, most notably that it will often double the required
storage and/or double the required computation per time-step when used
with matrix-free methods.  Also see the end of this message for the case
of general index-1 DAE where it is not possible to separate A_lhs from F.

 For now, I will assume that A_lhs(t) is actually a nonlinear operator
 O_lhs(u,u_t,t). Then we have
 
   O_lhs(u,u_t,t) - F(u,t) = 0
 
 So if we do Backward-Euler,
 
   O_lhs(u^{n+1},t^{n+1}) - dt F(u^{n+1},t^{n+1}) - O_lhs(u^n,t^n) = G = 0
 
 so for Newton
 
   (O'_lhs - dt F') du = G
 
 So are you asking for a nonlinear operator O?

Yes, so shifted function evaluation is

  O = O_lhs - dt F

and the shifted Jacobian is

  O' = O'_lhs - dt F' .

My proposal is that there are significant space- and time-savings if the
user can supply O and O' for an arbitrary shift dt, instead of O_lhs,
O'_lhs, F, F'.  For most problems, computing O and O' is essentially the
same amount of work as F and F', and computing O_lhs, O'_lhs may require
a similar amount of work (in my case) and often requires a similar
amount of storage (the preconditioning matrix for O'_lhs often has the
same nonzero pattern as F').  The purpose of O_lhs and O'_lhs is only
to compute shifts; they are never used alone (indeed, they cannot be
used alone for DAE).

Use of O, O' generalizes to high-order multistep and multistage implicit
methods.

 After I understand the answers to these, I can start to understand the
 comments about shifts, etc. that talk about implementing this new
 model.
 
 I just want to clarify the movement away from the original TS model.

Yes, of course.

My motivation comes from integrating DAE using a moving-mesh scheme
where the spatial discretization does not assemble the real Jacobian,
only a much sparser (factor 10--100) preconditioning matrix.  Evaluating
the action of the Jacobian is less expensive than function evaluation
only because transcendental functions don't need to be reevaluated, and
much of the cost comes from evaluating tensor-product operations at
quadrature points.  Keeping O_lhs and F separate means that O' (which
is what the integrator actually needs) must be applied as two separate
operations, O'_lhs and F', essentially doubling the cost (this is also
the case if O'_lhs is available via MFFD).  If O'_lhs and F' are
assembled, it doubles the required storage and may double the assembly
time.

I think the advantage of keeping O_lhs separate is only realized when
it is constant and F' is assembled, or when O_lhs is diagonal.

CVODE (Sundials) requires that O_lhs is the identity (there is no way to
change it) while IDA (the DAE solver) essentially asks the user for the
shifted versions.  Indeed, for general index-1 DAE F(u_t,u,t)=0, it is
not possible to separate O_lhs and F.  There is probably no advantage to
restricting ourselves to the semi-explicit case.  Maybe it's clearer in
the context of general index-1 DAE, equations of the form

  F(u_t, u, t) = 0 .

Solution using multistep (e.g. BDF) and multistage (Implicit
Runge-Kutta) methods requires solving algebraic systems of the form

  G(u) = F(known + alpha u, u, t) = 0

The required Jacobian is

  G'(u)  = alpha dF/du_t + dF/du .

These are analogous to the shifted operators I've been talking about,
but apply in more general circumstances.

Jed


TS: high-order with Galerkin spatial discretization, semi-explicit index-1 DAE

2009-03-19 Thread Matthew Knepley
Okay, I understand now. You are correct that this formulation should be
available.  In fact, I think it should be possible to make this the
bottom of the hierarchy, with the linear case a specialization, and the
identity case a specialization of that (with an MF application).  This
is a good chance to redesign TS.

Next question. How should we do it? We can continue the discussion on
email, but that makes it hard to refer back to pieces of code, or
suggested interfaces.  I think for a collaborative job, we need to
augment the email discussion with a Wiki of some type (I was very
impressed with Tim Gowers' polymath experiment).

Satish, do we have a Wiki, or could we set one up in short order?

  Thanks,

 Matt

On Thu, Mar 19, 2009 at 5:28 AM, Jed Brown jed at 59a2.org wrote:





-- 
What most experimenters take for granted before they begin their experiments
is infinitely more interesting than any results to which their experiments
lead.
-- Norbert Wiener

TS: high-order with Galerkin spatial discretization, semi-explicit index-1 DAE

2009-03-18 Thread Matthew Knepley
I could not understand the post. We will have to go more slowly. To begin,
I cannot understand why you cannot use TSSetMatrices()


http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/TS/TSSetMatrices.html#TSSetMatrices

  Thanks,

 Matt

On Tue, Mar 17, 2009 at 9:15 PM, Jed Brown jed at 59a2.org wrote:







TS: high-order with Galerkin spatial discretization, semi-explicit index-1 DAE

2009-03-18 Thread Matthew Knepley
On Wed, Mar 18, 2009 at 5:27 PM, Jed Brown jed at 59a2.org wrote:


 Is this more clear?


Not a whole lot. Needs to go slower, since I think there are a bunch
of embedded assumptions, and I worry that the most simple, general
interface cannot be seen until they are clarified. I want it spelled
out very very explicitly. So, to begin, you are unhappy with

  A_lhs(t) u_t = F(u,t)

where A_lhs is a linear operator. Is this true?

For now, I will assume that A_lhs(t) is actually a nonlinear operator
O_lhs(u,u_t,t). Then we have

  O_lhs(u,u_t,t) - F(u,t) = 0

So if we do Backward-Euler,

  O_lhs(u^{n+1},t^{n+1}) - dt F(u^{n+1},t^{n+1}) - O_lhs(u^n,t^n) = G = 0

so for Newton

  (O'_lhs - dt F') du = G

So are you asking for a nonlinear operator O?

After I understand the answers to these, I can start to understand the
comments about shifts, etc. that talk about implementing this new
model.

I just want to clarify the movement away from the original TS model.

  Matt


 Jed



TS: high-order with Galerkin spatial discretization, semi-explicit index-1 DAE

2009-03-18 Thread Jed Brown
I have a use for high-order implicit strong stability preserving
integrators and plan to write a TS implementation based on some known
optimal methods (cf. Gottlieb, Ketcheson, Shu 2009).  There are two
cases of interest that I don't think TS can currently handle.

Suppose we are using a Galerkin discretization so that the mass matrix
is not the identity, but we still want to use high-order temporal
discretization.  In particular, for high-order spatial discretizations,
lumping the mass matrix is usually not acceptable, so the consistent
mass matrix has a dense inverse.  In these cases, MatShift is not
sufficient [*].  One option is
to allow shifts by an assembled mass matrix, but this is awkward with
JFNK and requires significant storage when the mass matrix is not
lumped.

Perhaps a better option is for function evaluation and Jacobian assembly
to observe the time-step (or a_{kk}*dt for Runge-Kutta methods).  This
also works in the more general setting of semi-explicit index-1 DAE,
presented in the form

  V u_t = F(u,t)

where F is the spatial discretization and V is typically block diagonal
with mass matrices for differential variables and zero for algebraic
variables.  To integrate such systems, the user would provide functions
to evaluate

  G(u,u_i,t,dt) = V u - F(dt*u+u_i,t)

(u_i is a known inhomogeneous part) and assemble the
Jacobian/preconditioner G'(u,t,dt).  This is sufficient for multistep
and Diagonally Implicit Runge-Kutta (DIRK) methods.  For example, in the
case of DIRK, stage k is computed by solving

  G(u^k, u_0 + dt \sum_{l=0}^{k-1} a_{kl} u^l, t^k, dt a_{kk}) = 0

for u^k.  For singly implicit Runge-Kutta (SIRK, a_{kk} constant)
methods, lagging the preconditioner is perfectly acceptable since the
time-step does not change frequently.  For JFNK with non-SIRK methods
and with high-order multistep methods, the preconditioner would often
have to be refactored to accommodate the shift, hence the inability to
effectively lag the preconditioner is an unavoidable weakness.  Of
course it is common that assembly is more expensive than PC setup in
which case assembling F' and shifting by the mass matrix is appealing
(assuming that storage for an assembled mass matrix is available).

To summarize, asking the user to provide G,G' has the pros:
  + consistent mass matrices with minimal cost
  + DAE
  + moving meshes
and cons:
  - not compatible with current F,F'
  - shifts requiring reassembly are more expensive

Asking the user to observe the time-step is at odds with the current
interface where the user only provides F and F'.  Are the
performance/flexibility benefits worth it?  How about a middle ground
where the application can provide a lumped mass matrix (identity by
default) and either functions F,F' or G,G' (where if G' is provided and
the preconditioning matrix is different from the Jacobian, the
preconditioner would be shifted using the lumped mass matrix)?

I'd love to hear comments or suggestions.


Jed


[*] For my purposes, I don't actually need an assembled consistent mass
matrix because my Jacobian is available via MatShell or MFFD (i.e. since
shifting by the lumped mass matrix still produces a good preconditioner,
the generalization of MatShift would be sufficient).  However, some
people use the same matrix for the Jacobian and for preconditioning,
hence a general scheme probably ought to be able to handle this.