Send the configure.log and make.log for the two system configurations that produce very different results, as well as the output from running with -dm_view -info for both runs. The cause is likely not subtle: one build is likely using Metis and the other is likely just not using any partitioner at all.
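For reference, the diagnostic run could look something like this (the executable name and its other arguments are placeholders for the actual application):

    mpiexec -n 160 ./my_app -dm_view -info > run.log 2>&1

It is also worth checking whether each build actually picked up a graph partitioner at all, for example by looking at the generated configuration header (for a --prefix install the header sits under that install's include directory instead):

    grep -iE "parmetis|chaco|ptscotch" $PETSC_DIR/$PETSC_ARCH/include/petscconf.h

I believe that when no graph partitioner is available, DMPlex falls back on the trivial "simple" partitioner, which just slices the naturally ordered points into contiguous chunks; with a node ordering that is layered from bottom to top, that would produce exactly the layer-by-layer pattern described below.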
> On Sep 15, 2019, at 6:07 PM, Matthew Knepley via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> On Sun, Sep 15, 2019 at 6:59 PM Danyang Su <danyang...@gmail.com> wrote:
>
> Hi Matt,
>
> Thanks for the quick reply. I have made no change to the adjacency. The source code and the simulation input files are all the same. I also tried the GNU compiler and MPICH with PETSc 3.11.3 and it works fine.
>
> It looks like the problem is caused by the difference in configuration. However, the configuration is pretty much the same as for PETSc 3.9.3 except for the compiler and MPI used. I will contact the SciNet staff to check whether they have any idea on this.
>
> Very, very strange, since the partition is handled completely by Metis and does not use MPI.
>
>   Thanks,
>
>     Matt
>
> Thanks,
>
> Danyang
>
> On September 15, 2019 3:20:18 p.m. PDT, Matthew Knepley <knep...@gmail.com> wrote:
>
> On Sun, Sep 15, 2019 at 5:19 PM Danyang Su via petsc-users <petsc-users@mcs.anl.gov> wrote:
>
> Dear All,
>
> I have a question regarding a strange partition problem in PETSc 3.11. The problem does not exist on my local workstation; however, on a cluster with different PETSc versions the partition looks quite different, as you can see in the figure below, which was tested with 160 processors. The color indicates the processor that owns each subdomain. In this layered prism mesh there are 40 layers from bottom to top, and each layer has around 20k nodes. The natural ordering of the nodes is also layered from bottom to top.
>
> The left partition (PETSc 3.10 and earlier) looks good, with a minimal number of ghost nodes, while the right one (PETSc 3.11) looks weird, with a huge number of ghost nodes. It looks like the right one partitions the mesh layer by layer. This problem exists on the cluster but not on my local workstation for the same PETSc version (with a different compiler and MPI). Other than the difference in partition and efficiency, the simulation results are the same.
> Below is the PETSc configuration on the three machines:
>
> Local workstation (works fine):
>
> ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-scalapack --download-parmetis --download-metis --download-ptscotch --download-fblaslapack --download-hypre --download-superlu_dist --download-hdf5=yes --download-ctetgen --with-debugging=0 COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 --with-cxx-dialect=C++11
>
> Cluster with PETSc 3.9.3 (works fine):
>
> --prefix=/scinet/niagara/software/2018a/opt/intel-2018.2-intelmpi-2018.2/petsc/3.9.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-debugging=0 --with-hdf5=1 --with-mkl_pardiso-dir=/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/niagara/intel/2018.2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0
>
> Cluster with PETSc 3.11.3 (looks weird):
>
> --prefix=/scinet/niagara/software/2019b/opt/intel-2019u4-intelmpi-2019u4/petsc/3.11.3 CC=mpicc CXX=mpicxx F77=mpif77 F90=mpif90 FC=mpifc COPTFLAGS="-march=native -O2" CXXOPTFLAGS="-march=native -O2" FOPTFLAGS="-march=native -O2" --download-chaco=1 --download-hdf5=1 --download-hypre=1 --download-metis=1 --download-ml=1 --download-mumps=1 --download-parmetis=1 --download-plapack=1 --download-prometheus=1 --download-ptscotch=1 --download-scotch=1 --download-sprng=1 --download-superlu=1 --download-superlu_dist=1 --download-triangle=1 --with-avx512-kernels=1 --with-blaslapack-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-cxx-dialect=C++11 --with-debugging=0 --with-mkl_pardiso-dir=/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl --with-scalapack=1 --with-scalapack-lib="[/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_scalapack_lp64.so,/scinet/intel/2019u4/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64/libmkl_blacs_intelmpi_lp64.so]" --with-x=0
>
> The partition uses the default DMPlex distribution:
>
>       !c distribute mesh over processes
>       call DMPlexDistribute(dmda_flow%da,stencil_width, &
>                             PETSC_NULL_SF,              &
>                             PETSC_NULL_OBJECT,          &
>                             distributedMesh,ierr)
>       CHKERRQ(ierr)
>
> Any idea on this strange problem?
>
> I just looked at the code. Your mesh should be partitioned by k-way partitioning using Metis, since it is on 1 proc for partitioning. This code is the same for 3.9 and 3.11, and you get the same result on your machine. I cannot understand what might be happening on your cluster (MPI plays no role). Is it possible that you changed the adjacency specification in that version?
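One way to take the guesswork out of this is to pin the partitioner down explicitly before distributing. Below is a minimal sketch using the C API (the function name DistributeWithParmetis and the choice of ParMETIS are only for illustration; the Fortran code above would use the corresponding Fortran bindings):

    #include <petscdmplex.h>

    /* Explicitly select ParMETIS before distributing, instead of relying on
       whatever default partitioner this particular build picked up.        */
    static PetscErrorCode DistributeWithParmetis(DM dm, PetscInt overlap, DM *dmDist)
    {
      PetscPartitioner part;
      PetscErrorCode   ierr;

      PetscFunctionBeginUser;
      ierr = DMPlexGetPartitioner(dm, &part);CHKERRQ(ierr);
      ierr = PetscPartitionerSetType(part, PETSCPARTITIONERPARMETIS);CHKERRQ(ierr);
      /* Also read -petscpartitioner_type etc., so the choice can still be
         overridden from the command line.                                  */
      ierr = PetscPartitionerSetFromOptions(part);CHKERRQ(ierr);
      ierr = DMPlexDistribute(dm, overlap, NULL, dmDist);CHKERRQ(ierr);
      PetscFunctionReturn(0);
    }

Passing NULL for the PetscSF output simply discards the migration SF, matching the call above. Comparing the two builds with the partitioner fixed this way should show immediately whether the difference comes from the partitioner selection or from something else.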
> Thanks,
>
>    Matt
>
> Thanks,
>
> Danyang
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments lead.
>
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/