Thanks for the help, Matthew.
The next step would be to verify that the operators you get in those
slots are correct in parallel. I normally just print out the matrix
for a small problem. This is the only way I know to debug assembly.
To check the solver, you set it up as an exact solver and check that it
converges in 1 iteration. That means using LU as the subsolver
for the first block, using the full Schur complement, and solving the
Schur system with a residual tolerance of something like 1e-10.
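To illustrate why that setup should converge in one iteration, here is a minimal dense NumPy sketch (not PETSc code). It applies the full block factorization with exact inner solves to a 2x2 block system and checks the residual. The function name `schur_full_solve` and the random test blocks are assumptions made up for this illustration.

```python
import numpy as np

def schur_full_solve(A, B, C, D, f, g):
    """Solve [[A, B], [C, D]] [u; p] = [f; g] with a full Schur factorization.

    Hypothetical helper: every inner solve is exact (dense direct solve),
    which stands in for LU on the first block and on the Schur system.
    """
    Ainv_f = np.linalg.solve(A, f)          # exact solve with the first block
    Ainv_B = np.linalg.solve(A, B)
    S = D - C @ Ainv_B                      # full Schur complement
    p = np.linalg.solve(S, g - C @ Ainv_f)  # exact Schur solve
    u = Ainv_f - Ainv_B @ p                 # back-substitute for the first field
    return u, p

# Small, well-conditioned test blocks (assumed data, only for the sketch).
rng = np.random.default_rng(0)
n, m = 6, 3
A = 10.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((m, n))
D = 10.0 * np.eye(m) + 0.1 * rng.standard_normal((m, m))
f = rng.standard_normal(n)
g = rng.standard_normal(m)

u, p = schur_full_solve(A, B, C, D, f, g)

# Residual of the full coupled system after a single application.
K = np.block([[A, B], [C, D]])
rhs = np.concatenate([f, g])
rel_res = np.linalg.norm(K @ np.concatenate([u, p]) - rhs) / np.linalg.norm(rhs)
print("relative residual after one application:", rel_res)
```

If the blocks are assembled consistently, one application already gives the solution up to rounding, so an outer Krylov method preconditioned this way has nothing left to do after the first iteration. Needing several iterations with this configuration therefore points at the operators rather than at the solver.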
There is certainly something wrong in the nested matrix, as this is the
output in serial (LU + full Schur LU):
  0 SNES Function norm 3.240309155660e+03
    0 KSP Residual norm 3.240309155660e+03
    1 KSP Residual norm 3.219893365194e-08
  1 SNES Function norm 3.999872813081e+03
    0 KSP Residual norm 1.214145680994e+04
    1 KSP Residual norm 2.742817643080e-04
  2 SNES Function norm 3.663148395735e+01
    0 KSP Residual norm 3.980103473350e+03
    1 KSP Residual norm 1.814754406828e-05
  3 SNES Function norm 2.136252100504e+01
    0 KSP Residual norm 6.970609974245e+01
    1 KSP Residual norm 1.982821753925e-05
  4 SNES Function norm 1.059694009734e-02
    0 KSP Residual norm 2.136131725219e+01
    1 KSP Residual norm 4.545878835451e-06
    2 KSP Residual norm 8.841427426699e-08
    3 KSP Residual norm 3.305898584949e-09
  5 SNES Function norm 3.729120062850e-03
    0 KSP Residual norm 3.846880232871e-03
    1 KSP Residual norm 1.740268533385e-06
    2 KSP Residual norm 3.013108797112e-08
    3 KSP Residual norm 1.105602265959e-09
  6 SNES Function norm 1.117697448247e-09
whereas in parallel it is this (two processes):
  0 SNES Function norm 3.240309155660e+03
    0 KSP Residual norm 3.240309155660e+03
    1 KSP Residual norm 2.158162432432e-01
    2 KSP Residual norm 2.156121452805e-01
    3 KSP Residual norm 2.148639482595e-01
    4 KSP Residual norm 2.148248193325e-01
    5 KSP Residual norm 1.305140166188e-03
    Linear solve converged due to CONVERGED_RTOL iterations 5
  1 SNES Function norm 4.774611203034e+42
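Following the suggestion to print the matrix for a small problem, one way to compare the serial and parallel operators is to dump both as dense arrays and diff them entry by entry. The sketch below is only an illustration in NumPy: the helper `matrix_mismatches` and the 3x3 example values are made up, and how the two matrices get exported from the application is left open.

```python
import numpy as np

def matrix_mismatches(a_serial, a_parallel, rtol=1e-12, atol=0.0):
    """Return (row, col, serial_value, parallel_value) for every differing entry.

    Hypothetical helper: both inputs are dense arrays holding the same
    operator, one assembled in serial and one assembled in parallel.
    """
    a_serial = np.asarray(a_serial, dtype=float)
    a_parallel = np.asarray(a_parallel, dtype=float)
    if a_serial.shape != a_parallel.shape:
        raise ValueError("matrices have different shapes")
    # Entries that disagree beyond the given tolerances.
    bad = ~np.isclose(a_serial, a_parallel, rtol=rtol, atol=atol)
    return [(int(i), int(j), float(a_serial[i, j]), float(a_parallel[i, j]))
            for i, j in np.argwhere(bad)]

# Made-up example: the "parallel" copy has lost one off-diagonal coupling.
serial = np.array([[4.0, -1.0, 0.0],
                   [-1.0, 4.0, -1.0],
                   [0.0, -1.0, 4.0]])
parallel = serial.copy()
parallel[2, 1] = 0.0

mismatches = matrix_mismatches(serial, parallel)
for row, col, s_val, p_val in mismatches:
    print(f"entry ({row}, {col}): serial {s_val}, parallel {p_val}")
```

The row and column indices of the mismatches show which block of the nested matrix is affected and whether the wrong entries sit at the boundary between the two processes.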
Best,
NB