The printout I get from -snes_view is shown below. I wonder if the issue
is related to "using user-defined postcheck step"?


SNES Object: 1 MPI process
  type: newtonls
  maximum iterations=5, maximum function evaluations=10000
  tolerances: relative=0., absolute=0., solution=0.
  total number of linear solver iterations=3
  total number of function evaluations=4
  norm schedule ALWAYS
  SNESLineSearch Object: 1 MPI process
    type: basic
    maxstep=1.000000e+08, minlambda=1.000000e-12
    tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
    maximum iterations=40
    using user-defined postcheck step
  KSP Object: 1 MPI process
    type: preonly
    maximum iterations=10000, initial guess is zero
    tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
    left preconditioning
    using NONE norm type for convergence test
  PC Object: 1 MPI process
    type: cholesky
      out-of-place factorization
      tolerance for zero pivot 2.22045e-14
      matrix ordering: external
      factor fill ratio given 0., needed 0.
        Factored matrix follows:
          Mat Object: 1 MPI process
            type: mumps
            rows=1152, cols=1152
            package used to perform factorization: mumps
            total: nonzeros=126936, allocated nonzeros=126936
              MUMPS run parameters:
                Use -ksp_view ::ascii_info_detail to display information for all processes
                RINFOG(1) (global estimated flops for the elimination after analysis): 1.63461e+07
                RINFOG(2) (global estimated flops for the assembly after factorization): 74826.
                RINFOG(3) (global estimated flops for the elimination after factorization): 1.63461e+07
                (RINFOG(12) RINFOG(13))*2^INFOG(34) (determinant): (0.,0.)*(2^0)
                INFOG(3) (estimated real workspace for factors on all processors after analysis): 150505
                INFOG(4) (estimated integer workspace for factors on all processors after analysis): 6276
                INFOG(5) (estimated maximum front size in the complete tree): 216
                INFOG(6) (number of nodes in the complete tree): 24
                INFOG(7) (ordering option effectively used after analysis): 2
                INFOG(8) (structural symmetry in percent of the permuted matrix after analysis): 100
                INFOG(9) (total real/complex workspace to store the matrix factors after factorization): 150505
                INFOG(10) (total integer space store the matrix factors after factorization): 6276
                INFOG(11) (order of largest frontal matrix after factorization): 216
                INFOG(12) (number of off-diagonal pivots): 1044
                INFOG(13) (number of delayed pivots after factorization): 0
                INFOG(14) (number of memory compress after factorization): 0
                INFOG(15) (number of steps of iterative refinement after solution): 0
                INFOG(16) (estimated size (in MB) of all MUMPS internal data for factorization after analysis: value on the most memory consuming processor): 2
                INFOG(17) (estimated size of all MUMPS internal data for factorization after analysis: sum over all processors): 2
                INFOG(18) (size of all MUMPS internal data allocated during factorization: value on the most memory consuming processor): 2
                INFOG(19) (size of all MUMPS internal data allocated during factorization: sum over all processors): 2
                INFOG(20) (estimated number of entries in the factors): 126936
                INFOG(21) (size in MB of memory effectively used during factorization - value on the most memory consuming processor): 2
                INFOG(22) (size in MB of memory effectively used during factorization - sum over all processors): 2
                INFOG(23) (after analysis: value of ICNTL(6) effectively used): 0
                INFOG(24) (after analysis: value of ICNTL(12) effectively used): 1
                INFOG(25) (after factorization: number of pivots modified by static pivoting): 0
                INFOG(28) (after factorization: number of null pivots encountered): 0
                INFOG(29) (after factorization: effective number of entries in the factors (sum over all processors)): 126936
                INFOG(30, 31) (after solution: size in Mbytes of memory used during solution phase): 2, 2
                INFOG(32) (after analysis: type of analysis done): 1
                INFOG(33) (value used for ICNTL(8)): 7
                INFOG(34) (exponent of the determinant if determinant is requested): 0
                INFOG(35) (after factorization: number of entries taking into account BLR factor compression - sum over all processors): 126936
                INFOG(36) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - value on the most memory consuming processor): 0
                INFOG(37) (after analysis: estimated size of all MUMPS internal data for running BLR in-core - sum over all processors): 0
                INFOG(38) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - value on the most memory consuming processor): 0
                INFOG(39) (after analysis: estimated size of all MUMPS internal data for running BLR out-of-core - sum over all processors): 0
    linear system matrix = precond matrix:
    Mat Object: 1 MPI process
      type: seqaij
      rows=1152, cols=1152
      total: nonzeros=60480, allocated nonzeros=60480
      total number of mallocs used during MatSetValues calls=0
        using I-node routines: found 384 nodes, limit used is 5
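
In case it's relevant, the postcheck is attached along these lines (a
minimal sketch, not the actual libMesh wiring; MyPostCheck is a placeholder
name):

  static PetscErrorCode MyPostCheck(SNESLineSearch ls, Vec x, Vec y, Vec w,
                                    PetscBool *changed_y, PetscBool *changed_w,
                                    void *ctx)
  {
    PetscFunctionBeginUser;
    *changed_y = PETSC_FALSE;
    *changed_w = PETSC_FALSE;
    /* ... optionally modify the step y or the candidate solution w ... */
    PetscFunctionReturn(PETSC_SUCCESS);
  }

  /* registered once after SNES setup: */
  SNESLineSearch ls;
  PetscCall(SNESGetLineSearch(snes, &ls));
  PetscCall(SNESLineSearchSetPostCheck(ls, MyPostCheck, NULL));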



On Mon, Dec 22, 2025 at 9:25 AM Barry Smith <[email protected]> wrote:

>   David,
>
>     This is due to a software glitch. SNES_DIVERGED_FUNCTION_DOMAIN was
> added long after the origins of SNES and, in places, the code was never
> fully updated to handle function domain problems. In particular, parts of
> the line search don't handle it correctly. Can you run with -snes_view?
> That will help us find the spot that needs to be updated.
>
>    Barry
>
>
> On Dec 21, 2025, at 5:53 PM, David Knezevic <[email protected]>
> wrote:
>
> Hi, actually, I have a follow-up on this topic.
>
> I noticed that when I call SNESSetFunctionDomainError(), it exits the
> solve as expected, but it leads to a converged reason
> "DIVERGED_LINE_SEARCH" instead of "DIVERGED_FUNCTION_DOMAIN". If I also
> set SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN) in the
> callback, then I get the expected SNES_DIVERGED_FUNCTION_DOMAIN converged
> reason, so that's what I'm doing now. I was surprised by this behavior,
> though, since I expected that calling SNESSetFunctionDomainError would
> lead to the DIVERGED_FUNCTION_DOMAIN converged reason, so I just wanted to
> check on what could be causing this.
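>
> For reference, the workaround in the callback looks roughly like this (a
> minimal sketch; the hit_nan guard is schematic, not our actual code):
>
>   if (hit_nan) { /* schematic: NaN detected during residual assembly */
>     PetscCall(SNESSetFunctionDomainError(snes));
>     /* without the next line we observe DIVERGED_LINE_SEARCH instead */
>     PetscCall(SNESSetConvergedReason(snes, SNES_DIVERGED_FUNCTION_DOMAIN));
>   }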
>
> FYI, I'm using PETSc 3.23.4
>
> Thanks,
> David
>
>
> On Thu, Dec 18, 2025 at 8:10 AM David Knezevic <[email protected]>
> wrote:
>
>> Thank you very much for this guidance. I switched to using
>> SNES_DIVERGED_FUNCTION_DOMAIN, and I don't get any errors now.
>>
>> Thanks!
>> David
>>
>>
>> On Wed, Dec 17, 2025 at 3:43 PM Barry Smith <[email protected]> wrote:
>>
>>>
>>>
>>> On Dec 17, 2025, at 2:47 PM, David Knezevic <[email protected]>
>>> wrote:
>>>
>>> Stefano and Barry: Thank you, this is very helpful.
>>>
>>> I'll give some more info here which may help to clarify further.
>>> Normally we do just get a negative "converged reason", as you described.
>>> But in this specific case where I'm having issues the solve is a
>>> numerically sensitive creep solve, which has exponential terms in the
>>> residual and jacobian callback that can "blow up" and give NaN values. In
>>> this case, the root cause is that we hit a NaN value during a callback, and
>>> then we throw an exception (in libMesh C++ code) which I gather leads to
>>> the SNES solve exiting with this error code.
>>>
>>> Is there a way to tell the SNES to terminate with a negative "converged
>>> reason" because we've encountered some issue during the callback?
>>>
>>>
>>>    In your callback you should call SNESSetFunctionDomainError() and
>>> make sure the function value has an infinity or NaN in it (you can call
>>> VecFlag() for this purpose).
>>>
>>>    Now SNESConvergedReason will be a completely
>>> reasonable SNES_DIVERGED_FUNCTION_DOMAIN.
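>>>
>>>    For example (a minimal sketch; the outside_domain test is schematic,
>>> and VecFlag's second argument is a flag selecting whether to fill the
>>> vector with infinity):
>>>
>>>   PetscErrorCode FormFunction(SNES snes, Vec x, Vec f, void *ctx)
>>>   {
>>>     PetscFunctionBeginUser;
>>>     /* ... assemble f from x ... */
>>>     if (outside_domain) { /* schematic: e.g. an exponential overflowed */
>>>       PetscCall(SNESSetFunctionDomainError(snes));
>>>       PetscCall(VecFlag(f, 1)); /* mark f by filling it with infinity */
>>>     }
>>>     PetscFunctionReturn(PETSC_SUCCESS);
>>>   }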
>>>
>>>   Barry
>>>
>>> If you are using an ancient version of PETSc (I hope you are using the
>>> latest, since that always has more bug fixes and features) that does not
>>> have SNESSetFunctionDomainError, then just make sure the function vector
>>> result has an infinity or NaN in it; then SNESConvergedReason will be
>>> SNES_DIVERGED_FNORM_NAN.
>>>
>>>
>>>
>>>
>>> Thanks,
>>> David
>>>
>>>
>>> On Wed, Dec 17, 2025 at 2:25 PM Barry Smith <[email protected]> wrote:
>>>
>>>>
>>>>
>>>> On Dec 17, 2025, at 2:08 PM, David Knezevic via petsc-users <
>>>> [email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm using PETSc via the libMesh framework, so creating a MWE is
>>>> complicated by that, unfortunately.
>>>>
>>>> The situation is that I am not modifying the solution vector in a
>>>> callback. The SNES solve has terminated with PetscErrorCode 82, and I then
>>>> want to update the solution vector (reset it to the "previously converged
>>>> value") and then try to solve again with a smaller load increment. This is
>>>> a typical "auto load stepping" strategy in FE.
>>>>
>>>>
>>>>    Once a PetscError is generated you CANNOT continue the PETSc
>>>> program; it is not designed to allow this, and trying to continue will
>>>> lead to further problems.
>>>>
>>>>    So what you need to do is prevent PETSc from getting to the point
>>>> where an actual PetscErrorCode of 82 is generated.  Normally SNESSolve()
>>>> returns without generating an error even if the nonlinear solver failed
>>>> (for example did not converge). One then uses SNESGetConvergedReason to
>>>> check if it converged or not. Normally when SNESSolve() returns, regardless
>>>> of whether the converged reason is negative or positive, there will be no
>>>> locked vectors and one can modify the SNES object and call SNESSolve again.
>>>>
>>>>   So my guess is that an actual PETSc error is being generated because
>>>> SNESSetErrorIfNotConverged(snes,PETSC_TRUE) is being called by either
>>>> your code or libMesh, or the option -snes_error_if_not_converged is
>>>> being used. In your case, when you wish the code to keep working after a
>>>> non-converged SNESSolve(), these options should never be set; instead
>>>> you should check the result of SNESGetConvergedReason() to see whether
>>>> SNESSolve has failed. If SNESSetErrorIfNotConverged() is never being
>>>> set, that may indicate you are using an old version of PETSc or have hit
>>>> a bug inside PETSc's SNES that does not handle errors correctly, and we
>>>> can help fix the problem if you can provide full debug output from when
>>>> the error occurs.
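>>>>
>>>>   The recovery pattern looks like this (a minimal sketch; x_prev and
>>>> dload are placeholder names for your last converged solution and load
>>>> increment):
>>>>
>>>>   SNESConvergedReason reason;
>>>>   PetscCall(SNESSolve(snes, NULL, x));
>>>>   PetscCall(SNESGetConvergedReason(snes, &reason));
>>>>   if (reason < 0) {
>>>>     PetscCall(VecCopy(x_prev, x)); /* restore last converged solution */
>>>>     dload *= 0.5;                  /* shrink the load increment, retry */
>>>>   }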
>>>>
>>>>   Barry
>>>>
>>>> I think the key piece of info I'd like to know is: at what point is the
>>>> solution vector "unlocked" by the SNES object? Should it be unlocked as
>>>> soon as the SNES solve has terminated with PetscErrorCode 82? It seems
>>>> to me that it hasn't been unlocked yet (maybe just on a subset of the
>>>> processes). Should I manually "unlock" the solution vector by calling
>>>> VecLockWriteSet?
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>
>>>> On Wed, Dec 17, 2025 at 2:02 PM Stefano Zampini <
>>>> [email protected]> wrote:
>>>>
>>>>> You are not allowed to call VecGetArray on the solution vector of an
>>>>> SNES object within a user callback, nor to modify its values in any other
>>>>> way.
>>>>> Put in C++ lingo, the solution vector is a "const" argument.
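>>>>>
>>>>> Read-only access inside a callback is fine, for example (a minimal
>>>>> sketch):
>>>>>
>>>>>   const PetscScalar *xa;
>>>>>   PetscCall(VecGetArrayRead(x, &xa)); /* allowed on a locked Vec */
>>>>>   /* ... read the entries of xa ... */
>>>>>   PetscCall(VecRestoreArrayRead(x, &xa));
>>>>>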
>>>>> It would be great if you could provide an MWE to help us understand
>>>>> your problem.
>>>>>
>>>>>
>>>>> On Wed, Dec 17, 2025 at 8:51 PM David Knezevic via petsc-users
>>>>> <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have a question about this error:
>>>>>>
>>>>>>> Vector 'Vec_0x84000005_0' (argument #2) was locked for read-only
>>>>>>> access in unknown_function() at unknown file:0 (line numbers only
>>>>>>> accurate to function begin)
>>>>>>
>>>>>>
>>>>>> I'm encountering this error in an FE solve where an error occurs
>>>>>> during the residual/jacobian assembly. What we normally do in that
>>>>>> situation is shrink the load step and continue, starting from the
>>>>>> "last converged solution". However, in this case I'm running on 32
>>>>>> processes, and 5 of the processes report the error above about a
>>>>>> "locked vector".
>>>>>>
>>>>>> We clear the SNES object (via SNESDestroy) before we reset the
>>>>>> solution to the "last converged solution", and then we make a new SNES
>>>>>> object subsequently. But it seems to me that somehow the solution vector 
>>>>>> is
>>>>>> still marked as "locked" on 5 of the processes when we modify the 
>>>>>> solution
>>>>>> vector, which leads to the error above.
>>>>>>
>>>>>> I was wondering if someone could advise on the best way to handle
>>>>>> this? I thought one option could be to add an MPI barrier call prior
>>>>>> to resetting the solution vector to the "last converged solution", to
>>>>>> make sure that the SNES object is destroyed on all procs (and hence
>>>>>> the locks cleared) before editing the solution vector, but I'm unsure
>>>>>> if that would make a difference. Any help would be most appreciated!
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Stefano
>>>>>
>>>>
>>>>
>>>
>
