Thank you, I will try BCGSL. And it's good to know that this is worth pursuing and that it is possible. Step 1, I guess, is to upgrade to the latest release of PETSc.
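For my own reference, here is roughly how I expect to switch solvers once I've upgraded (just a sketch of what I believe the calls and options are; please correct me if anything has changed):

    #include <petscksp.h>

    /* Sketch: switch an existing KSP from BCGS to BCGSL (BiCGStab(L)).
       I am assuming ell = 2 is a reasonable starting point. */
    static PetscErrorCode UseBCGSL(KSP ksp)
    {
      PetscFunctionBeginUser;
      PetscCall(KSPSetType(ksp, KSPBCGSL));
      PetscCall(KSPBCGSLSetEll(ksp, 2));  /* number of search directions */
      PetscCall(KSPSetFromOptions(ksp));  /* allow -ksp_type / -ksp_bcgsl_ell to override */
      PetscFunctionReturn(0);
    }

or, equivalently, at run time: -ksp_type bcgsl -ksp_bcgsl_ell 2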
How can I make sure that I am "using an MPI that follows the suggestion for implementers about determinism"? I am using MPICH version 3.3a2. I am fairly sure that I am assembling the same matrix every time, but I am not sure how that would depend on 'how you do the communication'. Each process does a series of MatSetValues calls with INSERT_VALUES, assembling the matrix by rows, and my understanding is that this process is deterministic. (A simplified sketch of the assembly pattern is at the bottom of this message, below the quoted text.)

On Sat, Apr 1, 2023 at 9:05 PM Jed Brown <j...@jedbrown.org> wrote:
> If you use unpreconditioned BCGS and ensure that you assemble the same
> matrix (depends how you do the communication for that), I think you'll get
> bitwise reproducible results when using an MPI that follows the suggestion
> for implementers about determinism. Beyond that, it'll depend somewhat on
> the preconditioner.
>
> If you like BCGS, you may want to try BCGSL, which has a longer memory and
> tends to be more robust. But preconditioning is usually critical and the
> place to devote most effort.
>
> Mark McClure <m...@resfrac.com> writes:
>
> > Hello,
> >
> > I have been a user of Petsc for quite a few years, though I haven't
> > updated my version in a few years, so it's possible that my comments
> > below could be 'out of date'.
> >
> > Several years ago, I'd asked you guys about reproducibility. I observed
> > that if I gave an identical matrix to the Petsc linear solver, I would
> > get a bit-wise identical result back if running on one processor, but if
> > I ran with MPI, I would see differences in the final sig figs, below the
> > convergence criterion, even when rerunning the same exact calculation on
> > the same exact machine.
> >
> > I.e., with repeated tests, it was always converging to the same answer
> > 'within convergence tolerance', but not consistent in the sig figs
> > beyond the convergence tolerance.
> >
> > At the time, the response was that this was unavoidable, and related to
> > the fact that machine arithmetic is not associative, so the timing of
> > when processors were recombining information (which was random,
> > effectively a race condition) was causing these differences.
> >
> > Am I remembering correctly? And, if so, is this still a property of the
> > Petsc linear solver with MPI, and is there now any option available to
> > resolve it? I would be willing to accept a performance hit in order to
> > get guaranteed bitwise consistency, even when running with MPI.
> >
> > I am using the solver KSPBCGS, without a preconditioner. I made this
> > selection because, several years ago, I did testing and found that, on
> > the particular linear systems I usually work with, this solver (with no
> > preconditioner) was the most robust in terms of consistently converging,
> > and also the best in terms of performance. Actually, I also tested a
> > variety of linear solvers other than Petsc (including other
> > implementations of BiCGStab), and found that the Petsc BCGS was the best
> > performer. Though I'm curious: have there been updates to that algorithm
> > in recent years, such that I should consider updating to a newer Petsc
> > build and comparing?
> >
> > Best regards,
> > Mark McClure
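To be concrete, here is the simplified sketch of the assembly pattern mentioned above (the row/column indices and values are placeholders; the real ones come from my application code):

    #include <petscmat.h>

    /* Simplified sketch of the per-rank assembly: each process inserts the
       rows it is responsible for with INSERT_VALUES, then the usual
       MatAssemblyBegin/End. The tridiagonal "stencil" is only a placeholder. */
    static PetscErrorCode AssembleByRows(Mat A)
    {
      PetscInt rstart, rend, N;

      PetscFunctionBeginUser;
      PetscCall(MatGetSize(A, &N, NULL));
      PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
      for (PetscInt row = rstart; row < rend; ++row) {
        PetscInt    cols[3];
        PetscScalar vals[3];
        PetscInt    ncols = 0;
        if (row > 0)     { cols[ncols] = row - 1; vals[ncols] = -1.0; ncols++; }
        cols[ncols] = row; vals[ncols] = 2.0; ncols++;
        if (row < N - 1) { cols[ncols] = row + 1; vals[ncols] = -1.0; ncols++; }
        PetscCall(MatSetValues(A, 1, &row, ncols, cols, vals, INSERT_VALUES));
      }
      PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
      PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));
      PetscFunctionReturn(0);
    }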