[petsc-users] Bug report VecNorm

2023-12-09 Thread Stephan Köhler

Dear PETSc/Tao team,

there is a bug in the voector interface:  In the function
VecNorm, see, eg. 
https://petsc.org/release/src/vec/vec/interface/rvector.c.html#VecNorm 
line 197 the check for consistency in line 214 is done on the wrong 
communicator.  The communicator should be PETSC_COMM_SELF.

Otherwise the program may hang when PetscCheck is executed.

Please find a minimal example attached.


Kind regards,
Stephan Köhler

--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3188 (Büro)

#include "petscvec.h"

int main(int argc, char **args)
{
  PetscMPIInt size, rank;
  Vec vec;
  PetscReal   norm;
  PetscBool   flg = PETSC_FALSE, minflg = PETSC_FALSE;
  MPI_Commcomm;
  PetscScalar *xx;

  PetscCall(PetscInitialize(&argc, &args, PETSC_NULLPTR, PETSC_NULLPTR));

  comm = PETSC_COMM_WORLD;
  PetscCallMPI(MPI_Comm_size(comm, &size));
  PetscCallMPI(MPI_Comm_rank(comm, &rank));
  PetscCheck(size > 1, comm, PETSC_ERR_ARG_WRONG, "example should be called with more than 1 MPI rank.");

  PetscCall(VecCreateMPI(comm, (rank+1)*10, PETSC_DETERMINE, &vec));
  PetscCall(VecSet(vec, 1.0));
  PetscCall(VecNorm(vec, NORM_INFINITY, &norm));

  PetscSynchronizedPrintf(PETSC_COMM_WORLD, "rank = %d, size = %d, norm = %lf\n", rank, size, norm);
  PetscSynchronizedFlush(comm, PETSC_STDOUT);

  if(rank == 0)
  {
PetscCall(VecGetArrayWrite(vec, &xx));
PetscCall(VecRestoreArrayWrite(vec, &xx));
  }

  PetscCall(VecNormAvailable(vec, NORM_INFINITY, &flg, &norm));

  PetscSynchronizedPrintf(comm, "rank = %d, size = %d, flg = %d, norm = %lf\n", rank, size, flg, norm);
  PetscSynchronizedFlush(comm, PETSC_STDOUT);

  PetscCall(MPIU_Allreduce(&flg, &minflg, 1, MPIU_BOOL, MPI_LAND, PetscObjectComm((PetscObject)vec)));
  /* wrong */
  PetscCheck(flg == minflg, PetscObjectComm((PetscObject)vec), PETSC_ERR_ARG_WRONGSTATE, "Some MPI processes have cached norm, others do not. This may happen when some MPI processes call VecGetArray() and some others do not.");
  /* this is correct */
  // PetscCheck(flg == minflg, PETSC_COMM_SELF, PETSC_ERR_ARG_WRONGSTATE, "Some MPI processes have cached norm, others do not. This may happen when some MPI processes call VecGetArray() and some others do not.");

  PetscCall(VecDestroy(&vec));

  PetscFinalize();

  return 0;

}


OpenPGP_0xC9BF2C20DFE9F713.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: [petsc-users] Bug Report TaoALMM class

2023-12-09 Thread Stephan Köhler

Dear PETSc/Tao team,

this is still an open issue andI haven't heard anything else so far that I'm 
wrong.

Kind regards,
Stephan Köhler

Am 18.07.23 um 02:21 schrieb Matthew Knepley:

Toby and Hansol,

Has anyone looked at this?

   Thanks,

  Matt

On Mon, Jun 12, 2023 at 8:24 AM Stephan Köhler <
stephan.koeh...@math.tu-freiberg.de> wrote:


Dear PETSc/Tao team,

I think there might be a bug in the Tao ALMM class:  In the function
TaoALMMComputeAugLagAndGradient_Private(), see, eg.

https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html#TAOALMM
line 648 the gradient seems to be wrong.

The given function and gradient computation is
Lc = F + Ye^TCe + Yi^T(Ci - S) + 0.5*mu*[Ce^TCe + (Ci - S)^T(Ci - S)],
dLc/dX = dF/dX + Ye^TAe + Yi^TAi + 0.5*mu*[Ce^TAe + (Ci - S)^TAi],

but I think the gradient should be (without 0.5)

dLc/dX = dF/dX + Ye^TAe + Yi^TAi + mu*[Ce^TAe + (Ci - S)^TAi].

Kind regards,
Stephan Köhler

--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3173 (Büro)




--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3188 (Büro)



OpenPGP_0xC9BF2C20DFE9F713.asc
Description: OpenPGP public key


OpenPGP_signature.asc
Description: OpenPGP digital signature


[petsc-users] Bug Report TaoALMM class

2023-06-12 Thread Stephan Köhler

Dear PETSc/Tao team,

I think there might be a bug in the Tao ALMM class:  In the function
TaoALMMComputeAugLagAndGradient_Private(), see, eg. 
https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html#TAOALMM 
line 648 the gradient seems to be wrong.


The given function and gradient computation is
Lc = F + Ye^TCe + Yi^T(Ci - S) + 0.5*mu*[Ce^TCe + (Ci - S)^T(Ci - S)],
dLc/dX = dF/dX + Ye^TAe + Yi^TAi + 0.5*mu*[Ce^TAe + (Ci - S)^TAi],

but I think the gradient should be (without 0.5)

dLc/dX = dF/dX + Ye^TAe + Yi^TAi + mu*[Ce^TAe + (Ci - S)^TAi].

Kind regards,
Stephan Köhler

--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3173 (Büro)



OpenPGP_0xC9BF2C20DFE9F713.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[petsc-users] Bug report LMVM matrix class

2023-05-09 Thread Stephan Köhler

Dear PETSc/Tao team,

it seems to be that there is a bug in the LMVM matrix class:

The function MatMultAdd_LMVM, see, e.g., 
https://petsc.org/release/src/ksp/ksp/utils/lmvm/lmvmimpl.c.html at line 
114, if the vectors Y and Z are the same, then the result is wrong, 
since the first MatMult overwrites also the value in Y.


Best regards
Stephan Köhler

--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3173 (Büro)



OpenPGP_0xC9BF2C20DFE9F713.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


Re: [petsc-users] Report Bug TaoALMM class

2022-11-04 Thread Stephan Köhler

Barry,

this is a nonartificial code.  This is a problem in the ALMM subsolver.  
I want to solve a problem with a TaoALMM solver what then happens is:


TaoSolve(tao)    /* TaoALMM solver */
   |
   |
   |>   This calls the TaoALMM subsolver routine

 TaoSolve(subsolver)
   |
   |
   |--->   The subsolver does not correctly 
work, at least with an Armijo line search, since the solution is 
overwritten within the line search.
   In my case, the subsolver does 
not make any progress although it is possible.


To get to my real problem you can simply change line 268 to if(0) (from 
if(1) -> if(0)) and line 317 from // ierr = TaoSolve(tao); 
CHKERRQ(ierr);  ---> ierr = TaoSolve(tao); CHKERRQ(ierr);
What you can see is that the solver does not make any progress, but it 
should make progress.


To be honest, I do not really know why the option 
-tao_almm_subsolver_tao_ls_monitor has know effect if the ALMM solver is 
called and not the subsolver. I also do not know why 
-tao_almm_subsolver_tao_view prints as termination reason for the subsolver


 Solution converged:    ||g(X)|| <= gatol

This is obviously not the case.  I set the tolerance
-tao_almm_subsolver_tao_gatol 1e-8 \
-tao_almm_subsolver_tao_grtol 1e-8 \

I encountered this and then I looked into the ALMM class and therefore I 
tried to call the subsolver (previous example).


I attach the updated programm and also the options.

Stephan





<https://www.dict.cc/?s=obviously>
On 03.11.22 22:15, Barry Smith wrote:


  Thanks for your response and the code. I understand the potential 
problem and how your code demonstrates a bug if the 
TaoALMMSubsolverObjective() is used in the manner you use in the 
example where you directly call TaoComputeObjective() multiple times 
line a line search code might.


  What I don't have or understand is how to reproduce the problem in a 
real code that uses Tao. That is where the Tao Armijo line search code 
has a problem when it is used (somehow) in a Tao solver with ALMM. You 
suggest "If you have an example for your own, you can switch the 
Armijo line search by the option -tao_ls_type armijo.  The thing is 
that it will cause no problems if the line search accepts the steps 
with step length one."  I don't see how to do this if I use -tao_type 
almm I cannot use -tao_ls_type armijo; that is the option -tao_ls_type 
doesn't seem to me to be usable in the context of almm (since almm 
internally does directly its own trust region approach for 
globalization). If we remove the if (1) code from your example, is 
there some Tao options I can use to get the bug to appear inside the 
Tao solve?


I'll try to explain again, I agree that the fact that the Tao solution 
is aliased (within the ALMM solver) is a problem with repeated calls 
to TaoComputeObjective() but I cannot see how these repeated calls 
could ever happen in the use of TaoSolve() with the ALMM solver. That 
is when is this "design problem" a true problem as opposed to just a 
potential problem that can be demonstrated in artificial code?


The reason I need to understand the non-artificial situation it breaks 
things is to come up with an appropriate correction for the current code.


  Barry







On Nov 3, 2022, at 12:46 PM, Stephan Köhler 
 wrote:


Barry,

so far, I have not experimented with trust-region methods, but I can 
imagine that this "design feature" causes no problem for trust-region 
methods, if the old point is saved and after the trust-region check 
fails the old point is copied to the actual point.  But the 
implementation of the Armijo line search method does not work that 
way.  Here, the actual point will always be overwritten.  Only if the 
line search fails, then the old point is restored, but then the 
TaoSolve method ends with a line search failure.


If you have an example for your own, you can switch the Armijo line 
search by the option -tao_ls_type armijo.  The thing is that it will 
cause no problems if the line search accepts the steps with step 
length one.
It is also possible that, by luck, it will cause no problems, if the 
"excessive" step brings a reduction of the objective


Otherwise, I attach my example, which is not minimal, but here you 
can see that it causes problems.  You need to set the paths to the 
PETSc library in the makefile.  You find the options for this problem 
in the run_test_tao_neohooke.sh script.

The import part begins at line 292 in test_tao_neohooke.cpp

Stephan

On 02.11.22 19:04, Barry Smith wrote:

   Stephan,

 I have located the troublesome line in TaoSetUp_ALMM() it has the line

   auglag->Px = tao->solution;

and in alma.h it has

Vec  Px, LgradX, Ce, Ci, G; /* aliased vectors (do not destroy!) */

Now auglag->P in some situations alias 

Re: [petsc-users] Report Bug TaoALMM class

2022-11-03 Thread Stephan Köhler

Barry,

so far, I have not experimented with trust-region methods, but I can 
imagine that this "design feature" causes no problem for trust-region 
methods, if the old point is saved and after the trust-region check 
fails the old point is copied to the actual point.  But the 
implementation of the Armijo line search method does not work that way.  
Here, the actual point will always be overwritten.  Only if the line 
search fails, then the old point is restored, but then the TaoSolve 
method ends with a line search failure.


If you have an example for your own, you can switch the Armijo line 
search by the option -tao_ls_type armijo.  The thing is that it will 
cause no problems if the line search accepts the steps with step length 
one.
It is also possible that, by luck, it will cause no problems, if the 
"excessive" step brings a reduction of the objective


Otherwise, I attach my example, which is not minimal, but here you can 
see that it causes problems.  You need to set the paths to the PETSc 
library in the makefile.  You find the options for this problem in the 
run_test_tao_neohooke.sh script.

The import part begins at line 292 in test_tao_neohooke.cpp

Stephan

On 02.11.22 19:04, Barry Smith wrote:

   Stephan,

 I have located the troublesome line in TaoSetUp_ALMM() it has the line

   auglag->Px = tao->solution;

and in alma.h it has

Vec  Px, LgradX, Ce, Ci, G; /* aliased vectors (do not destroy!) */

Now auglag->P in some situations alias auglag->P  and in some cases auglag->Px 
serves to hold a portion of auglag->P. So then in TaoALMMSubsolverObjective_Private()
the lines

PetscCall(VecCopy(P, auglag->P));
  PetscCall((*auglag->sub_obj)(auglag->parent));

causes, just as you said, tao->solution to be overwritten by the P at which the 
objective function is being computed. In other words, the solution of the outer 
Tao is aliased with the solution of the inner Tao, by design.

You are definitely correct, the use of TaoALMMSubsolverObjective_Private and 
TaoALMMSubsolverObjectiveAndGradient_Private  in a line search would be 
problematic.

I am not an expert at these methods or their implementations. Could you point to an 
actual use case within Tao that triggers the problem. Is there a set of command line 
options or code calls to Tao that fail due to this "design feature". Within the 
standard use of ALMM I do not see how the objective function would be used within a line 
search. The TaoSolve_ALMM() code is self-correcting in that if a trust region check fails 
it automatically rolls back the solution.

   Barry





On Oct 28, 2022, at 4:27 AM, Stephan 
Köhler  wrote:

Dear PETSc/Tao team,

it seems to be that there is a bug in the TaoALMM class:

In the methods TaoALMMSubsolverObjective_Private and 
TaoALMMSubsolverObjectiveAndGradient_Private the vector where the function 
value for the augmented Lagrangian is evaluate
is copied into the current solution, see, 
e.g.,https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html  line 
672 or 682.  This causes subsolver routine to not converge if the line search 
for the subsolver rejects the step length 1. for some
update.  In detail:

Suppose the current iterate is xk and the current update is dxk. The line 
search evaluates the augmented Lagrangian now at (xk + dxk).  This causes that 
the value (xk + dxk) is copied in the current solution.  If the point (xk + 
dxk) is rejected, the line search should
try the point (xk + alpha * dxk), where alpha < 1.  But due to the copying, 
what happens is that the point ((xk + dxk) + alpha * dxk) is evaluated, see, 
e.g.,https://petsc.org/release/src/tao/linesearch/impls/armijo/armijo.c.html  line 
191.

Best regards
Stephan Köhler

--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3173 (Büro)




--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3173 (Büro)



Minimal_example_without_vtk_2.tar.gz
Description: application/gzip


OpenPGP_0xC9BF2C20DFE9F713.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[petsc-users] Bug report LMVM matrix class

2022-11-02 Thread Stephan Köhler

Dear PETSc/Tao team,

it seems to be that there is a bug in the LMVM matrix class:

In the function MatCreateVecs_LMVM, see, e.g., 
https://petsc.org/release/src/ksp/ksp/utils/lmvm/lmvmimpl.c.html at line 
214.
it is not checked if the vectors  *L, or *R are NULL.  This is, in 
particular, a problem if this matrix class is combined with the Schur 
complement matrix class, since MatMult_SchurComplement
calls this function with NULL as *R, see, e.g. 
https://petsc.org/release/src/ksp/ksp/utils/schurm/schurm.c.html at line 66.


I attach a minimal example.  You need to modify the paths to the PETSc 
installation in the makefile.


Best regards
Stephan Köhler

--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3173 (Büro)



Minimal_example_schur_lmvm.tar.gz
Description: application/gzip


OpenPGP_0xC9BF2C20DFE9F713.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature


[petsc-users] Report Bug TaoALMM class

2022-10-28 Thread Stephan Köhler

Dear PETSc/Tao team,

it seems to be that there is a bug in the TaoALMM class:

In the methods TaoALMMSubsolverObjective_Private and 
TaoALMMSubsolverObjectiveAndGradient_Private the vector where the 
function value for the augmented Lagrangian is evaluate
is copied into the current solution, see, e.g., 
https://petsc.org/release/src/tao/constrained/impls/almm/almm.c.html 
line 672 or 682.  This causes subsolver routine to not converge if the 
line search for the subsolver rejects the step length 1. for some

update.  In detail:

Suppose the current iterate is xk and the current update is dxk. The 
line search evaluates the augmented Lagrangian now at (xk + dxk).  This 
causes that the value (xk + dxk) is copied in the current solution.  If 
the point (xk + dxk) is rejected, the line search should
try the point (xk + alpha * dxk), where alpha < 1.  But due to the 
copying, what happens is that the point ((xk + dxk) + alpha * dxk) is 
evaluated, see, e.g., 
https://petsc.org/release/src/tao/linesearch/impls/armijo/armijo.c.html 
line 191.


Best regards
Stephan Köhler

--
Stephan Köhler
TU Bergakademie Freiberg
Institut für numerische Mathematik und Optimierung

Akademiestraße 6
09599 Freiberg
Gebäudeteil Mittelbau, Zimmer 2.07

Telefon: +49 (0)3731 39-3173 (Büro)



OpenPGP_0xC9BF2C20DFE9F713.asc
Description: OpenPGP public key


OpenPGP_signature
Description: OpenPGP digital signature