Aldo,

    I understand the situation you are facing. 

    With pseudo-continuation, when you provide an RHS function, SNES is computing udot -
F(u). Inside the SNESSolve(), specifically in TSComputeIFunction(), with the lines

    if (!ts->ops->ifunction) {
      ierr = TSComputeRHSFunction(ts,t,X,Y);CHKERRQ(ierr);
      ierr = VecAYPX(Y,-1,Xdot);CHKERRQ(ierr);

But for the original system, we care about the norm for F(u). So in TSStep_Pseudo(), after the nonlinear solve
there is the line

    pseudo->fnorm = -1; /* The current norm is no longer valid, monitor must recompute it. */

and further down

  if (pseudo->fnorm < 0) {
    PetscCall(VecZeroEntries(pseudo->xdot));
    PetscCall(TSComputeIFunction(ts, ts->ptime, ts->vec_sol, pseudo->xdot, pseudo->func, PETSC_FALSE));
    PetscCall(VecNorm(pseudo->func, NORM_2, &pseudo->fnorm));

this norm is needed to determine of the nonlinear system has converged satisfactory (or if more pseudo time steps are needed).

Hence, your FormRHSFunction gets called twice with identical input. 

To make matters even worse your FormRHSFunction is actually being called THREE times with identical input. Because at the start of TSStep_Pseudo() it calls TSPseudoComputeTimeStep(ts, &next_time_step) which by default uses TSPseudoTimeStepDefault() which has the same chunk of code 

 PetscCall(VecZeroEntries(pseudo->xdot));
  PetscCall(TSComputeIFunction(ts, ts->ptime, ts->vec_sol, pseudo->xdot, pseudo->func, PETSC_FALSE));
  PetscCall(VecNorm(pseudo->func, NORM_2, &pseudo->fnorm));

And the same norm is also recomputed.

Now the diff 

diff --git a/src/ts/impls/pseudo/posindep.c b/src/ts/impls/pseudo/posindep.c
index 7c013435f98..813a28311e7 100644
--- a/src/ts/impls/pseudo/posindep.c
+++ b/src/ts/impls/pseudo/posindep.c
@@ -660,9 +660,11 @@ PetscErrorCode TSPseudoTimeStepDefault(TS ts, PetscReal *newdt, void *dtctx)
   PetscReal  inc    = pseudo->dt_increment;
 
   PetscFunctionBegin;
-  PetscCall(VecZeroEntries(pseudo->xdot));
-  PetscCall(TSComputeIFunction(ts, ts->ptime, ts->vec_sol, pseudo->xdot, pseudo->func, PETSC_FALSE));
-  PetscCall(VecNorm(pseudo->func, NORM_2, &pseudo->fnorm));
+  if (pseudo->fnorm < 0.0) {
+    PetscCall(VecZeroEntries(pseudo->xdot));
+    PetscCall(TSComputeIFunction(ts, ts->ptime, ts->vec_sol, pseudo->xdot, pseudo->func, PETSC_FALSE));
+    PetscCall(VecNorm(pseudo->func, NORM_2, &pseudo->fnorm));
+  }
   if (pseudo->fnorm_initial < 0) {
     /* first time through so compute initial function norm */
     pseudo->fnorm_initial  = pseudo->fnorm

will prevent the THIRD computation of the same function evaluation. So this is a good optimization

But eliminating the second computation is not so trivial. To review, deep inside the TS code there is

TSComputeRHSFunction(ts,t,X,Y);
VecAYPX(Y,-1,Xdot);

This code knows nothing about TSPPseudo etc.

Then later in the Pseudo code

 PetscCall(VecZeroEntries(pseudo->xdot));
 PetscCall(TSComputeIFunction(ts, ts->ptime, ts->vec_sol, pseudo->xdot, pseudo->func, PETSC_FALSE));
 PetscCall(VecNorm(pseudo->func, NORM_2, &pseudo->fnorm));

 again calls TSComputeRHSFunction(ts,t,X,Y); with the same X.  (since Xdot is not used directly by TSComputeRHSFunction()). 

Now if all the code was custom for Pseudo we could simply put in a line 

TSComputeRHSFunction(ts,t,X,Y);
VecNorm(Y,NORM_2, &pseudo->fnorm)
VecAYPX(Y,-1,Xdot);

into TSComputeIFunction() and have the norm available later for pseudo. 

It is not clear that there is any definitive benefit for pseudo to utilize the TS framework. It possibly gets some code reuse from TS but at the cost of extra code that won't be needed if written directly. Plus this performance hit.

Anyways I will make a merge request to remove the third redundant computation of FormRHSFunction but will await input from others on if anything else could be done to remove the inefficiency.

Thanks for reporting this challenging problem. 

 Barry

Note: I don't think providing the Jacobian improves the efficiency, it is just better at hiding the redundant computation.
 







On Aug 26, 2025, at 4:21 AM, Aldo Bonfiglioli <aldo.bonfigli...@unibas.it> wrote:

Hi there,

I am using -ts_type pseudo to find steady solutions of the Euler/NS PDEs.

It looks like FormRHSFunction is called twice with the exactly same value of the dependent variable (u)

before invoking TSRHSJacobianFn and then solving the linear system with KSP (u is then updated)


1 TS dt 0.141197 time 0.141197
Calling FormRHSFunction; ||u|| =    60.489405479528003        463.15635675129079        62.813532026019146        5.9789502719351155     
Calling FormRHSFunction; ||u|| =    60.489405479528003        463.15635675129079        62.813532026019146        5.9789502719351155
Calling TSRHSJacobianFn with ||u|| =    60.489405479528003        463.15635675129079        62.813532026019146        5.9789502719351155
     Linear solve converged due to CONVERGED_RTOL iterations 9
Calling FormRHSFunction; ||u|| =    60.489405476878737        463.15635675602186        62.813532031616965        5.9789502710519811
2 TS dt 0.155917 time 0.297115
Calling FormRHSFunction; ||u|| =    60.489405476878737        463.15635675602186        62.813532031616965        5.9789502710519811
Calling FormRHSFunction; ||u|| =    60.489405476878737        463.15635675602186        62.813532031616965        5.9789502710519811
Calling TSRHSJacobianFn with ||u|| =    60.489405476878737        463.15635675602186        62.813532031616965        5.9789502710519811

Why is it so? Am I possibly misusing (or missing) some options?

My .petscrc file is attached.

Thank you,

Aldo

-- 
Dr. Aldo Bonfiglioli
Associate professor of Fluid Mechanics
Dipartimento di Ingegneria
Universita' della Basilicata
V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY
tel:+39.0971.205203 fax:+39.0971.205215
web: http://docenti.unibas.it/site/home/docente.html?m=002423

-dm_plex_dim 2
-dm_plex_filename 
/home/abonfi/grids/2D/aerofoils/naca0012/euler/gmsh/naca0012.msh
#-dm_plex_filename /home/abonfi/grids/2D/aerofoils/naca0012/euler/gmsh/mesh.h5
-testcase 3
##-dm_view
##-dm_plex_view_labels "marker"
##-dm_plex_view_labels "Face Sets"
-freestream_Mach_number .5
-flow_angles 1.,0.,0.
-dm_mat_type baij
##-Reynolds 1.e39
-colors 4,5,-1,-1,-1,-1
-dm_plex_simplex true
-dm_plex_interpolate 
-dm_plex_check_all
-petscpartitioner_view
##-dm_petscsection_view
##-ts_dt 100.
-ts_max_steps 100
-ts_type pseudo
-pc_factor_mat_ordering_type rcm
-pc_factor_reuse_ordering
-ksp_converged_reason
-options_left
##-ksp_view

Reply via email to