On Thu, May 4, 2023 at 5:03 PM Mark Lohry <mlo...@gmail.com> wrote:
>> Do you get different results (in different runs) without
>> -snes_mf_operator? So just using an explicit matrix?
>
> Unfortunately I don't have an explicit matrix available for this, hence
> the MFFD/JFNK.

I don't mean the actual matrix, I mean a representative matrix.

>> (Note: I am not convinced there is even a problem and think it may be
>> simply different order of floating point operations in different runs.)
>
> I'm not convinced either, but running explicit RK for 10,000 iterations I
> get exactly the same results every time, so I'm fairly confident it's not
> the residual evaluation. How would there be a different order of floating
> point ops in different runs in serial?
>
>> No, I mean without -snes_mf_* (as Barry says), so we are just running
>> that solver with a sparse matrix. This would give me confidence
>> that nothing in the solver is variable.
>
> I could do the sparse finite difference Jacobian once, save it to disk,
> and then use that system each time.

Yes. That would work.

  Thanks,

     Matt
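[For reference, a minimal sketch of that save/reload workflow. The routine names and the path argument are illustrative, not from Mark's code; it assumes the preconditioning matrix already attached to the SNES is the colored-FD one, and a serial AIJ format on reload.]

#include <petscsnes.h>

/* One-time run: assemble the finite-difference Jacobian at state x and
   dump it in PETSc binary format. */
PetscErrorCode SaveJacobian(SNES snes, Vec x, const char *path)
{
  Mat         J, P;
  PetscViewer viewer;

  PetscFunctionBeginUser;
  PetscCall(SNESGetJacobian(snes, &J, &P, NULL, NULL));
  PetscCall(SNESComputeJacobian(snes, x, J, P)); /* triggers the colored FD build */
  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, path, FILE_MODE_WRITE, &viewer));
  PetscCall(MatView(P, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* Later runs: load the frozen matrix and hand it to SNES as both operator
   and preconditioning matrix, so neither MFFD nor FD coloring is involved.
   Passing NULL for the Jacobian routine mirrors the usage in this thread;
   since the matrix is never recomputed, every solve sees identical entries. */
PetscErrorCode LoadFrozenJacobian(SNES snes, const char *path)
{
  Mat         A;
  PetscViewer viewer;

  PetscFunctionBeginUser;
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetType(A, MATSEQAIJ)); /* serial run, per the thread */
  PetscCall(PetscViewerBinaryOpen(PETSC_COMM_WORLD, path, FILE_MODE_READ, &viewer));
  PetscCall(MatLoad(A, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscCall(SNESSetJacobian(snes, A, A, NULL, NULL));
  PetscFunctionReturn(PETSC_SUCCESS);
}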
> On Thu, May 4, 2023 at 4:57 PM Matthew Knepley <knep...@gmail.com> wrote:
>
>> On Thu, May 4, 2023 at 4:44 PM Mark Lohry <mlo...@gmail.com> wrote:
>>
>>>> Is your code valgrind clean?
>>>
>>> Yes, I also initialize all allocations with NaNs to be sure I'm not
>>> using anything uninitialized.
>>>
>>>> We can try and test this. Replace your MatMFFD with an actual matrix
>>>> and run. Do you see any variability?
>>>
>>> I think I did what you're asking. I have -snes_mf_operator set, and
>>> then SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL) where
>>> diag_ones is a matrix with ones on the diagonal. Two runs below, still
>>> with differences but sometimes identical.
>>
>> No, I mean without -snes_mf_* (as Barry says), so we are just running
>> that solver with a sparse matrix. This would give me confidence
>> that nothing in the solver is variable.
>>
>>   Thanks,
>>
>>      Matt
>>
>>>   0 SNES Function norm 3.424003312857e+04
>>>     0 KSP Residual norm 3.424003312857e+04
>>>     1 KSP Residual norm 2.871734444536e+04
>>>     2 KSP Residual norm 2.490276930242e+04
>>>     3 KSP Residual norm 2.131675872968e+04
>>>     4 KSP Residual norm 1.973129814235e+04
>>>     5 KSP Residual norm 1.832377856317e+04
>>>     6 KSP Residual norm 1.716783617436e+04
>>>     7 KSP Residual norm 1.583963149542e+04
>>>     8 KSP Residual norm 1.482272170304e+04
>>>     9 KSP Residual norm 1.380312106742e+04
>>>    10 KSP Residual norm 1.297793480658e+04
>>>    11 KSP Residual norm 1.208599123244e+04
>>>    12 KSP Residual norm 1.137345655227e+04
>>>    13 KSP Residual norm 1.059676909366e+04
>>>    14 KSP Residual norm 1.003823862398e+04
>>>    15 KSP Residual norm 9.425879221354e+03
>>>    16 KSP Residual norm 8.954805890038e+03
>>>    17 KSP Residual norm 8.592372470456e+03
>>>    18 KSP Residual norm 8.060707175821e+03
>>>    19 KSP Residual norm 7.782057728723e+03
>>>    20 KSP Residual norm 7.449686095424e+03
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix followed by preconditioner matrix:
>>>   Mat Object: 1 MPI process
>>>     type: mffd
>>>     rows=16384, cols=16384
>>>     Matrix-free approximation:
>>>       err=1.49012e-08 (relative error in function evaluation)
>>>       Using wp compute h routine
>>>           Does not compute normU
>>>   Mat Object: 1 MPI process
>>>     type: seqaij
>>>     rows=16384, cols=16384
>>>     total: nonzeros=16384, allocated nonzeros=16384
>>>     total number of mallocs used during MatSetValues calls=0
>>>       not using I-node routines
>>>   1 SNES Function norm 1.085015646971e+04
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=23
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is applied matrix-free with differencing
>>>   Preconditioning Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix followed by preconditioner matrix:
>>>     Mat Object: 1 MPI process
>>>       type: mffd
>>>       rows=16384, cols=16384
>>>       Matrix-free approximation:
>>>         err=1.49012e-08 (relative error in function evaluation)
>>>         Using wp compute h routine
>>>             Does not compute normU
>>>     Mat Object: 1 MPI process
>>>       type: seqaij
>>>       rows=16384, cols=16384
>>>       total: nonzeros=16384, allocated nonzeros=16384
>>>       total number of mallocs used during MatSetValues calls=0
>>>         not using I-node routines
>>>
>>>   0 SNES Function norm 3.424003312857e+04
>>>     0 KSP Residual norm 3.424003312857e+04
>>>     1 KSP Residual norm 2.871734444536e+04
>>>     2 KSP Residual norm 2.490276931041e+04
>>>     3 KSP Residual norm 2.131675873776e+04
>>>     4 KSP Residual norm 1.973129814908e+04
>>>     5 KSP Residual norm 1.832377852186e+04
>>>     6 KSP Residual norm 1.716783608174e+04
>>>     7 KSP Residual norm 1.583963128956e+04
>>>     8 KSP Residual norm 1.482272160069e+04
>>>     9 KSP Residual norm 1.380312087005e+04
>>>    10 KSP Residual norm 1.297793458796e+04
>>>    11 KSP Residual norm 1.208599115602e+04
>>>    12 KSP Residual norm 1.137345657533e+04
>>>    13 KSP Residual norm 1.059676906197e+04
>>>    14 KSP Residual norm 1.003823857515e+04
>>>    15 KSP Residual norm 9.425879177747e+03
>>>    16 KSP Residual norm 8.954805850825e+03
>>>    17 KSP Residual norm 8.592372413320e+03
>>>    18 KSP Residual norm 8.060706994110e+03
>>>    19 KSP Residual norm 7.782057560782e+03
>>>    20 KSP Residual norm 7.449686034356e+03
>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>> KSP Object: 1 MPI process
>>>   type: gmres
>>>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>     happy breakdown tolerance 1e-30
>>>   maximum iterations=20, initial guess is zero
>>>   tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>   left preconditioning
>>>   using PRECONDITIONED norm type for convergence test
>>> PC Object: 1 MPI process
>>>   type: none
>>>   linear system matrix followed by preconditioner matrix:
>>>   Mat Object: 1 MPI process
>>>     type: mffd
>>>     rows=16384, cols=16384
>>>     Matrix-free approximation:
>>>       err=1.49012e-08 (relative error in function evaluation)
>>>       Using wp compute h routine
>>>           Does not compute normU
>>>   Mat Object: 1 MPI process
>>>     type: seqaij
>>>     rows=16384, cols=16384
>>>     total: nonzeros=16384, allocated nonzeros=16384
>>>     total number of mallocs used during MatSetValues calls=0
>>>       not using I-node routines
>>>   1 SNES Function norm 1.085015821006e+04
>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>> SNES Object: 1 MPI process
>>>   type: newtonls
>>>   maximum iterations=1, maximum function evaluations=-1
>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>   total number of linear solver iterations=20
>>>   total number of function evaluations=23
>>>   norm schedule ALWAYS
>>>   Jacobian is never rebuilt
>>>   Jacobian is applied matrix-free with differencing
>>>   Preconditioning Jacobian is built using finite differences with coloring
>>>   SNESLineSearch Object: 1 MPI process
>>>     type: basic
>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>>     maximum iterations=40
>>>   KSP Object: 1 MPI process
>>>     type: gmres
>>>       restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>       happy breakdown tolerance 1e-30
>>>     maximum iterations=20, initial guess is zero
>>>     tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>     left preconditioning
>>>     using PRECONDITIONED norm type for convergence test
>>>   PC Object: 1 MPI process
>>>     type: none
>>>     linear system matrix followed by preconditioner matrix:
>>>     Mat Object: 1 MPI process
>>>       type: mffd
>>>       rows=16384, cols=16384
>>>       Matrix-free approximation:
>>>         err=1.49012e-08 (relative error in function evaluation)
>>>         Using wp compute h routine
>>>             Does not compute normU
>>>     Mat Object: 1 MPI process
>>>       type: seqaij
>>>       rows=16384, cols=16384
>>>       total: nonzeros=16384, allocated nonzeros=16384
>>>       total number of mallocs used during MatSetValues calls=0
>>>         not using I-node routines
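[For concreteness, the diag_ones experiment Mark describes above might be set up roughly as below; the routine name and the size argument n are illustrative, and the program would be run with -snes_mf_operator so the actual operator stays matrix-free.]

#include <petscsnes.h>

/* Hypothetical sketch: register an identity matrix as the "Jacobian" so the
   preconditioner side is trivially deterministic, while -snes_mf_operator
   keeps applying the true operator by MFFD differencing of the residual. */
PetscErrorCode UseDiagOnesJacobian(SNES snes, PetscInt n)
{
  Mat      diag_ones;
  PetscInt i, rstart, rend;

  PetscFunctionBeginUser;
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n,
                         1, NULL, 0, NULL, &diag_ones));
  PetscCall(MatGetOwnershipRange(diag_ones, &rstart, &rend));
  for (i = rstart; i < rend; i++) PetscCall(MatSetValue(diag_ones, i, i, 1.0, INSERT_VALUES));
  PetscCall(MatAssemblyBegin(diag_ones, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(diag_ones, MAT_FINAL_ASSEMBLY));
  /* Same call as quoted in the thread: no Jacobian callback is attached. */
  PetscCall(SNESSetJacobian(snes, diag_ones, diag_ones, NULL, NULL));
  PetscFunctionReturn(PETSC_SUCCESS);
}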
>>> On Thu, May 4, 2023 at 10:10 AM Matthew Knepley <knep...@gmail.com> wrote:
>>>
>>>> On Thu, May 4, 2023 at 8:54 AM Mark Lohry <mlo...@gmail.com> wrote:
>>>>
>>>>>> Try -pc_type none.
>>>>>
>>>>> With -pc_type none the 0 KSP residual looks identical. But *sometimes*
>>>>> it's producing exactly the same history, and others it's gradually
>>>>> changing. I'm reasonably confident my residual evaluation has no
>>>>> randomness; see info after the PETSc output.
>>>>
>>>> We can try and test this. Replace your MatMFFD with an actual matrix
>>>> and run. Do you see any variability?
>>>>
>>>> If not, then it could be your routine, or it could be MatMFFD. So run
>>>> a few with -snes_view, and we can see if the "w" parameter changes.
>>>>
>>>>   Thanks,
>>>>
>>>>      Matt
>>>>
>>>>> solve history 1:
>>>>>
>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>     2 KSP Residual norm 2.490276931041e+04
>>>>> ...
>>>>>    20 KSP Residual norm 7.449686034356e+03
>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>   1 SNES Function norm 1.085015821006e+04
>>>>> solve history 2, identical to 1:
>>>>>
>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>     2 KSP Residual norm 2.490276931041e+04
>>>>> ...
>>>>>    20 KSP Residual norm 7.449686034356e+03
>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>   1 SNES Function norm 1.085015821006e+04
>>>>>
>>>>> solve history 3, identical KSP at 0 and 1, slight change at 2, growing
>>>>> difference to the end:
>>>>>
>>>>>   0 SNES Function norm 3.424003312857e+04
>>>>>     0 KSP Residual norm 3.424003312857e+04
>>>>>     1 KSP Residual norm 2.871734444536e+04
>>>>>     2 KSP Residual norm 2.490276930242e+04
>>>>> ...
>>>>>    20 KSP Residual norm 7.449686095424e+03
>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>   1 SNES Function norm 1.085015646971e+04
>>>>>
>>>>> This is using a standard explicit 3-stage Runge-Kutta smoother for 10
>>>>> iterations, so 30 calls of the same residual evaluation, with
>>>>> identical residuals every time.
>>>>>
>>>>> run 1:
>>>>>
>>>>> # iteration  rho                 rhou                rhov                rhoE                abs_res      rel_res      umin          vmax          vmin          elapsed_time
>>>>>   1.00000e+00  1.086860616292e+00  2.782316758416e+02  4.482867643761e+00  2.993435920340e+02  2.04353e+02  1.00000e+00  -8.23945e-15  -6.15326e-15  -1.35563e-14  6.34834e-01
>>>>>   2.00000e+00  2.310547487017e+00  1.079059352425e+02  3.958323921837e+00  5.058927165686e+02  2.58647e+02  1.26568e+00  -1.02539e-14  -9.35368e-15  -1.69925e-14  6.40063e-01
>>>>>   3.00000e+00  2.361005867444e+00  5.706213331683e+01  6.130016323357e+00  4.688968362579e+02  2.36201e+02  1.15585e+00  -1.19370e-14  -1.15216e-14  -1.59733e-14  6.45166e-01
>>>>>   4.00000e+00  2.167518999963e+00  3.757541401594e+01  6.313917437428e+00  4.054310291628e+02  2.03612e+02  9.96372e-01  -1.81831e-14  -1.28312e-14  -1.46238e-14  6.50494e-01
>>>>>   5.00000e+00  1.941443738676e+00  2.884190334049e+01  6.237106158479e+00  3.539201037156e+02  1.77577e+02  8.68970e-01   3.56633e-14  -8.74089e-15  -1.06666e-14  6.55656e-01
>>>>>   6.00000e+00  1.736947124693e+00  2.429485695670e+01  5.996962200407e+00  3.148280178142e+02  1.57913e+02  7.72745e-01  -8.98634e-14  -2.41152e-14  -1.39713e-14  6.60872e-01
>>>>>   7.00000e+00  1.564153212635e+00  2.149609219810e+01  5.786910705204e+00  2.848717011033e+02  1.42872e+02  6.99144e-01  -2.95352e-13  -2.48158e-14  -2.39351e-14  6.66041e-01
>>>>>   8.00000e+00  1.419280815384e+00  1.950619804089e+01  5.627281158306e+00  2.606623371229e+02  1.30728e+02  6.39715e-01   8.98941e-13   1.09674e-13   3.78905e-14  6.71316e-01
>>>>>   9.00000e+00  1.296115915975e+00  1.794843530745e+01  5.514933264437e+00  2.401524522393e+02  1.20444e+02  5.89394e-01   1.70717e-12   1.38762e-14   1.09825e-13  6.76447e-01
>>>>>   1.00000e+01  1.189639693918e+00  1.665381754953e+01  5.433183087037e+00  2.222572900473e+02  1.11475e+02  5.45501e-01  -4.22462e-12  -7.15206e-13  -2.28736e-13  6.81716e-01
>>>>> run N:
>>>>>
>>>>> # iteration  rho                 rhou                rhov                rhoE                abs_res      rel_res      umin          vmax          vmin          elapsed_time
>>>>>   1.00000e+00  1.086860616292e+00  2.782316758416e+02  4.482867643761e+00  2.993435920340e+02  2.04353e+02  1.00000e+00  -8.23945e-15  -6.15326e-15  -1.35563e-14  6.23316e-01
>>>>>   2.00000e+00  2.310547487017e+00  1.079059352425e+02  3.958323921837e+00  5.058927165686e+02  2.58647e+02  1.26568e+00  -1.02539e-14  -9.35368e-15  -1.69925e-14  6.28510e-01
>>>>>   3.00000e+00  2.361005867444e+00  5.706213331683e+01  6.130016323357e+00  4.688968362579e+02  2.36201e+02  1.15585e+00  -1.19370e-14  -1.15216e-14  -1.59733e-14  6.33558e-01
>>>>>   4.00000e+00  2.167518999963e+00  3.757541401594e+01  6.313917437428e+00  4.054310291628e+02  2.03612e+02  9.96372e-01  -1.81831e-14  -1.28312e-14  -1.46238e-14  6.38773e-01
>>>>>   5.00000e+00  1.941443738676e+00  2.884190334049e+01  6.237106158479e+00  3.539201037156e+02  1.77577e+02  8.68970e-01   3.56633e-14  -8.74089e-15  -1.06666e-14  6.43887e-01
>>>>>   6.00000e+00  1.736947124693e+00  2.429485695670e+01  5.996962200407e+00  3.148280178142e+02  1.57913e+02  7.72745e-01  -8.98634e-14  -2.41152e-14  -1.39713e-14  6.49073e-01
>>>>>   7.00000e+00  1.564153212635e+00  2.149609219810e+01  5.786910705204e+00  2.848717011033e+02  1.42872e+02  6.99144e-01  -2.95352e-13  -2.48158e-14  -2.39351e-14  6.54167e-01
>>>>>   8.00000e+00  1.419280815384e+00  1.950619804089e+01  5.627281158306e+00  2.606623371229e+02  1.30728e+02  6.39715e-01   8.98941e-13   1.09674e-13   3.78905e-14  6.59394e-01
>>>>>   9.00000e+00  1.296115915975e+00  1.794843530745e+01  5.514933264437e+00  2.401524522393e+02  1.20444e+02  5.89394e-01   1.70717e-12   1.38762e-14   1.09825e-13  6.64516e-01
>>>>>   1.00000e+01  1.189639693918e+00  1.665381754953e+01  5.433183087037e+00  2.222572900473e+02  1.11475e+02  5.45501e-01  -4.22462e-12  -7.15206e-13  -2.28736e-13  6.69677e-01
>>>>>
>>>>> On Thu, May 4, 2023 at 8:41 AM Mark Adams <mfad...@lbl.gov> wrote:
>>>>>
>>>>>> ASM is just the sub-PC with one proc, but it gets weaker with more
>>>>>> procs unless you use Jacobi. (Maybe I am missing something.)
>>>>>>
>>>>>> On Thu, May 4, 2023 at 8:31 AM Mark Lohry <mlo...@gmail.com> wrote:
>>>>>>
>>>>>>>> Please send the output of -snes_view.
>>>>>>>
>>>>>>> Pasted below. Anything stand out?
>>>>>>>
>>>>>>> SNES Object: 1 MPI process
>>>>>>>   type: newtonls
>>>>>>>   maximum iterations=1, maximum function evaluations=-1
>>>>>>>   tolerances: relative=0.1, absolute=1e-15, solution=1e-15
>>>>>>>   total number of linear solver iterations=20
>>>>>>>   total number of function evaluations=22
>>>>>>>   norm schedule ALWAYS
>>>>>>>   Jacobian is never rebuilt
>>>>>>>   Jacobian is applied matrix-free with differencing
>>>>>>>   Preconditioning Jacobian is built using finite differences with coloring
>>>>>>>   SNESLineSearch Object: 1 MPI process
>>>>>>>     type: basic
>>>>>>>     maxstep=1.000000e+08, minlambda=1.000000e-12
>>>>>>>     tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08
>>>>>>>     maximum iterations=40
>>>>>>>   KSP Object: 1 MPI process
>>>>>>>     type: gmres
>>>>>>>       restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
>>>>>>>       happy breakdown tolerance 1e-30
>>>>>>>     maximum iterations=20, initial guess is zero
>>>>>>>     tolerances: relative=0.1, absolute=1e-15, divergence=10.
>>>>>>>     left preconditioning
>>>>>>>     using PRECONDITIONED norm type for convergence test
>>>>>>>   PC Object: 1 MPI process
>>>>>>>     type: asm
>>>>>>>       total subdomain blocks = 1, amount of overlap = 0
>>>>>>>       restriction/interpolation type - RESTRICT
>>>>>>>       Local solver information for first block is in the following KSP and PC objects on rank 0:
>>>>>>>       Use -ksp_view ::ascii_info_detail to display information for all blocks
>>>>>>>       KSP Object: (sub_) 1 MPI process
>>>>>>>         type: preonly
>>>>>>>         maximum iterations=10000, initial guess is zero
>>>>>>>         tolerances: relative=1e-05, absolute=1e-50, divergence=10000.
>>>>>>>         left preconditioning
>>>>>>>         using NONE norm type for convergence test
>>>>>>>       PC Object: (sub_) 1 MPI process
>>>>>>>         type: ilu
>>>>>>>           out-of-place factorization
>>>>>>>           0 levels of fill
>>>>>>>           tolerance for zero pivot 2.22045e-14
>>>>>>>           matrix ordering: natural
>>>>>>>           factor fill ratio given 1., needed 1.
>>>>>>>             Factored matrix follows:
>>>>>>>               Mat Object: (sub_) 1 MPI process
>>>>>>>                 type: seqbaij
>>>>>>>                 rows=16384, cols=16384, bs=16
>>>>>>>                 package used to perform factorization: petsc
>>>>>>>                 total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>                     block size is 16
>>>>>>>         linear system matrix = precond matrix:
>>>>>>>         Mat Object: (sub_) 1 MPI process
>>>>>>>           type: seqbaij
>>>>>>>           rows=16384, cols=16384, bs=16
>>>>>>>           total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>           total number of mallocs used during MatSetValues calls=0
>>>>>>>               block size is 16
>>>>>>>     linear system matrix followed by preconditioner matrix:
>>>>>>>     Mat Object: 1 MPI process
>>>>>>>       type: mffd
>>>>>>>       rows=16384, cols=16384
>>>>>>>       Matrix-free approximation:
>>>>>>>         err=1.49012e-08 (relative error in function evaluation)
>>>>>>>         Using wp compute h routine
>>>>>>>             Does not compute normU
>>>>>>>     Mat Object: 1 MPI process
>>>>>>>       type: seqbaij
>>>>>>>       rows=16384, cols=16384, bs=16
>>>>>>>       total: nonzeros=1277952, allocated nonzeros=1277952
>>>>>>>       total number of mallocs used during MatSetValues calls=0
>>>>>>>           block size is 16
>>>>>>>
>>>>>>> On Thu, May 4, 2023 at 8:30 AM Mark Adams <mfad...@lbl.gov> wrote:
>>>>>>>
>>>>>>>> If you are using MG, what is the coarse grid solver? -snes_view
>>>>>>>> might give you that.
>>>>>>>>
>>>>>>>> On Thu, May 4, 2023 at 8:25 AM Matthew Knepley <knep...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> On Thu, May 4, 2023 at 8:21 AM Mark Lohry <mlo...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>> Do they start very similarly and then slowly drift further apart?
>>>>>>>>>>
>>>>>>>>>> Yes, this. I take it this sounds familiar?
>>>>>>>>>>
>>>>>>>>>> See these two examples with 20 fixed iterations pasted at the
>>>>>>>>>> end. The difference for one solve is slight (the final SNES norm
>>>>>>>>>> is identical to 5 digits), but in the context I'm using it in
>>>>>>>>>> (repeated applications to solve a steady-state multigrid problem,
>>>>>>>>>> though here just one level) the differences add up such that I
>>>>>>>>>> might reach global convergence in 35 iterations or 38. It's not
>>>>>>>>>> the end of the world, but I was expecting that with -np 1 these
>>>>>>>>>> would be identical, and I'm not sure where the root cause would be.
>>>>>>>>>
>>>>>>>>> The initial KSP residual is different, so it's the PC. Please send
>>>>>>>>> the output of -snes_view. If your ASM is using direct
>>>>>>>>> factorization, then it could be randomness in whatever LU you are
>>>>>>>>> using.
>>>>>>>>>
>>>>>>>>>   Thanks,
>>>>>>>>>
>>>>>>>>>      Matt
>>>>>>>>>
>>>>>>>>>>   0 SNES Function norm 2.801842107848e+04
>>>>>>>>>>     0 KSP Residual norm 4.045639499595e+01
>>>>>>>>>>     1 KSP Residual norm 1.917999809040e+01
>>>>>>>>>>     2 KSP Residual norm 1.616048521958e+01
>>>>>>>>>> [...]
>>>>>>>>>>    19 KSP Residual norm 8.788043518111e-01
>>>>>>>>>>    20 KSP Residual norm 6.570851270214e-01
>>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>   1 SNES Function norm 1.801309983345e+03
>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>
>>>>>>>>>> Same system, identical initial 0 SNES norm; the 0 KSP norm is
>>>>>>>>>> slightly different:
>>>>>>>>>>
>>>>>>>>>>   0 SNES Function norm 2.801842107848e+04
>>>>>>>>>>     0 KSP Residual norm 4.045639473002e+01
>>>>>>>>>>     1 KSP Residual norm 1.917999883034e+01
>>>>>>>>>>     2 KSP Residual norm 1.616048572016e+01
>>>>>>>>>> [...]
>>>>>>>>>>    19 KSP Residual norm 8.788046348957e-01
>>>>>>>>>>    20 KSP Residual norm 6.570859588610e-01
>>>>>>>>>>   Linear solve converged due to CONVERGED_ITS iterations 20
>>>>>>>>>>   1 SNES Function norm 1.801311320322e+03
>>>>>>>>>> Nonlinear solve converged due to CONVERGED_ITS iterations 1
>>>>>>>>>>
>>>>>>>>>> On Wed, May 3, 2023 at 11:05 PM Barry Smith <bsm...@petsc.dev> wrote:
>>>>>>>>>>
>>>>>>>>>>> Do they start very similarly and then slowly drift further
>>>>>>>>>>> apart? That is, for the first couple of KSP iterations they are
>>>>>>>>>>> almost identical, but each iteration gets a bit further apart.
>>>>>>>>>>> Similar for the SNES iterations, starting close and then, with
>>>>>>>>>>> more iterations and more solves, moving apart. Or do they
>>>>>>>>>>> suddenly jump to be very different? You can run with
>>>>>>>>>>> -snes_monitor -ksp_monitor.
>>>>>>>>>>>
>>>>>>>>>>> On May 3, 2023, at 9:07 PM, Mark Lohry <mlo...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> This is on a single MPI rank. I haven't checked the coloring; I
>>>>>>>>>>> was just guessing there. But the solutions/residuals are
>>>>>>>>>>> slightly different from run to run.
>>>>>>>>>>>
>>>>>>>>>>> Fair to say that for serial JFNK/ASM ILU(0)/GMRES we should
>>>>>>>>>>> expect bitwise identical results?
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 3, 2023, 8:50 PM Barry Smith <bsm...@petsc.dev> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> No, the coloring should be identical every time. Do you see
>>>>>>>>>>>> differences with 1 MPI rank? (Or much smaller ones?)
>>>>>>>>>>>>
>>>>>>>>>>>> > On May 3, 2023, at 8:42 PM, Mark Lohry <mlo...@gmail.com> wrote:
>>>>>>>>>>>> >
>>>>>>>>>>>> > I'm running multiple iterations of newtonls with an MFFD/JFNK
>>>>>>>>>>>> nonlinear solver where I give it the sparsity. PC asm, KSP
>>>>>>>>>>>> gmres, with SNESSetLagJacobian -2 (compute once and then frozen
>>>>>>>>>>>> Jacobian).
>>>>>>>>>>>> >
>>>>>>>>>>>> > I'm seeing slight (<1%) but nonzero differences in residuals
>>>>>>>>>>>> from run to run. I'm wondering where randomness might enter
>>>>>>>>>>>> here -- does the Jacobian coloring use a random seed?
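[For reference, a rough sketch of the setup Mark describes in that original message: JFNK operator via -snes_mf_operator, preconditioning Jacobian built once by colored finite differences on a user-supplied sparsity pattern, then frozen. FormFunction, the preallocated pattern in P, and the routine name are assumptions, not taken from his code.]

#include <petscsnes.h>

PetscErrorCode SetupFrozenColoredJacobian(SNES snes, Mat P,
                                          PetscErrorCode (*FormFunction)(SNES, Vec, Vec, void *),
                                          void *ctx)
{
  MatColoring   mc;
  ISColoring    iscoloring;
  MatFDColoring fdcoloring;

  PetscFunctionBeginUser;
  /* Color the sparsity pattern of P. The coloring is a deterministic graph
     algorithm, consistent with Barry's comment that it should be identical
     every time. */
  PetscCall(MatColoringCreate(P, &mc));
  PetscCall(MatColoringSetType(mc, MATCOLORINGSL));
  PetscCall(MatColoringApply(mc, &iscoloring));
  PetscCall(MatColoringDestroy(&mc));

  /* FD-with-coloring context that reuses the same residual routine. */
  PetscCall(MatFDColoringCreate(P, iscoloring, &fdcoloring));
  PetscCall(MatFDColoringSetFunction(fdcoloring, (PetscErrorCode (*)(void))FormFunction, ctx));
  PetscCall(MatFDColoringSetFromOptions(fdcoloring));
  PetscCall(MatFDColoringSetUp(P, iscoloring, fdcoloring));
  PetscCall(ISColoringDestroy(&iscoloring));

  /* P serves as the preconditioning matrix; with -snes_mf_operator the
     operator itself is applied by MFFD differencing of FormFunction. */
  PetscCall(SNESSetJacobian(snes, P, P, SNESComputeJacobianDefaultColor, fdcoloring));

  /* Compute the Jacobian once, then keep it frozen for all later solves
     (the "SNESSetLagJacobian -2" in the original message). fdcoloring
     should be destroyed once the SNES is done with it; omitted here. */
  PetscCall(SNESSetLagJacobian(snes, -2));
  PetscFunctionReturn(PETSC_SUCCESS);
}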

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
  -- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/