An old issue with SF_Window is at https://gitlab.com/petsc/petsc/-/issues/555, though it reports a different error.
--Junchao Zhang

On Sun, Sep 12, 2021 at 2:20 PM Junchao Zhang <junchao.zh...@gmail.com> wrote:

> We have met SF + Windows errors before. Stefano wrote the code, which I
> don't think was worth doing. SF with MPI one-sided is hard to get correct
> (due to shared-memory programming), performs badly, and no users use it.
> I would suggest we just disable the test and the feature. Stefano, what do
> you think?
>
> --Junchao Zhang
>
> On Sun, Sep 12, 2021 at 2:10 PM Pierre Jolivet <pie...@joliv.et> wrote:
>
>> On 12 Sep 2021, at 8:56 PM, Matthew Knepley <knep...@gmail.com> wrote:
>>
>> On Sun, Sep 12, 2021 at 2:49 PM Antonio T. sagitter <
>> sagit...@fedoraproject.org> wrote:
>>
>>> Attached are the configure.log/make.log from an MPI build in Fedora 34
>>> x86_64 where the error below occurred.
>>
>> This is OpenMPI 4.1.0. Is that the only MPI you build? My first
>> inclination is that this is an MPI implementation bug.
>>
>> Junchao, do we have an OpenMPI build in the CI?
>>
>> config/examples/arch-ci-linux-cuda-double-64idx.py: '--download-openmpi=1',
>> config/examples/arch-ci-linux-pkgs-dbg-ftn-interfaces.py: '--download-openmpi=1',
>> config/examples/arch-ci-linux-pkgs-opt.py: '--download-openmpi=1',
>>
>> config/BuildSystem/config/packages/OpenMPI.py uses version 4.1.0 as well.
>> I’m not sure PETSc is to blame here, Antonio. You may want to ditch the
>> OpenMPI shipped by your package manager and try --download-openmpi as
>> well, just for a quick sanity check.
>>
>> Thanks,
>> Pierre
>>
>> Thanks,
>>
>>    Matt
>>
>>> On 9/12/21 19:18, Antonio T. sagitter wrote:
>>> > Okay. I will try to set the DATAFILESPATH options correctly.
>>> >
>>> > I also see this error:
>>> >
>>> > not ok vec_is_sf_tutorials-ex1_4+sf_window_sync-fence_sf_window_flavor-create # Error code: 68
>>> > # PetscSF Object: 4 MPI processes
>>> > #   type: window
>>> > #   [0] Number of roots=3, leaves=2, remote ranks=2
>>> > #   [0] 0 <- (3,1)
>>> > #   [0] 1 <- (1,0)
>>> > #   [1] Number of roots=2, leaves=3, remote ranks=2
>>> > #   [1] 0 <- (0,1)
>>> > #   [1] 1 <- (2,0)
>>> > #   [1] 2 <- (0,2)
>>> > #   [2] Number of roots=2, leaves=3, remote ranks=3
>>> > #   [2] 0 <- (1,1)
>>> > #   [2] 1 <- (3,0)
>>> > #   [2] 2 <- (0,2)
>>> > #   [3] Number of roots=2, leaves=3, remote ranks=2
>>> > #   [3] 0 <- (2,1)
>>> > #   [3] 1 <- (0,0)
>>> > #   [3] 2 <- (0,2)
>>> > #   [0] Roots referenced by my leaves, by rank
>>> > #   [0] 1: 1 edges
>>> > #   [0]    1 <- 0
>>> > #   [0] 3: 1 edges
>>> > #   [0]    0 <- 1
>>> > #   [1] Roots referenced by my leaves, by rank
>>> > #   [1] 0: 2 edges
>>> > #   [1]    0 <- 1
>>> > #   [1]    2 <- 2
>>> > #   [1] 2: 1 edges
>>> > #   [1]    1 <- 0
>>> > #   [2] Roots referenced by my leaves, by rank
>>> > #   [2] 0: 1 edges
>>> > #   [2]    2 <- 2
>>> > #   [2] 1: 1 edges
>>> > #   [2]    0 <- 1
>>> > #   [2] 3: 1 edges
>>> > #   [2]    1 <- 0
>>> > #   [3] Roots referenced by my leaves, by rank
>>> > #   [3] 0: 2 edges
>>> > #   [3]    1 <- 0
>>> > #   [3]    2 <- 2
>>> > #   [3] 2: 1 edges
>>> > #   [3]    0 <- 1
>>> > #   current flavor=CREATE synchronization=FENCE MultiSF sort=rank-order
>>> > #   current info=MPI_INFO_NULL
>>> > #   [buildhw-x86-09:1135574] *** An error occurred in MPI_Accumulate
>>> > #   [buildhw-x86-09:1135574] *** reported by process [3562602497,3]
>>> > #   [buildhw-x86-09:1135574] *** on win rdma window 4
>>> > #   [buildhw-x86-09:1135574] *** MPI_ERR_RMA_RANGE: invalid RMA address range
>>> > #   [buildhw-x86-09:1135574] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
>>> > #   [buildhw-x86-09:1135574] *** and potentially your MPI job)
>>> > #   [buildhw-x86-09.iad2.fedoraproject.org:1135567] 3 more processes have sent help message help-mpi-errors.txt / mpi_errors_are_fatal
>>> > #   [buildhw-x86-09.iad2.fedoraproject.org:1135567] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>>> >
>>> > This looks like an error related to OpenMPI-4*:
>>> > https://github.com/open-mpi/ompi/issues/6374
>>>
>>> --
>>> ---
>>> Antonio Trande
>>> Fedora Project
>>> mailto: sagit...@fedoraproject.org
>>> GPG key: 0x29FBC85D7A51CC2F
>>> GPG key server: https://keyserver1.pgp.com/
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/