> On Sep 18, 2019, at 12:25 PM, Mark Lohry via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
> 
> Mark,
>  

    Mark,

      Good point. This has been a big headache forever.

      Note that this has been "fixed" in the master version of PETSc and will 
be in its next release. If you use --download-parmetis in the future, it will 
use the same random numbers on all machines and thus should produce the same 
partitions everywhere.
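
       Once that release is out (or if you work from the master branch now), a 
minimal configure sketch would look something like the following; the bracketed 
part is just a placeholder for whatever options you normally pass:

          ./configure --download-metis --download-parmetis [your usual options]
          make all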

       I think that metis has always used the same random numbers on all 
machines and thus has always produced the same results.

    Barry


> The machine, compiler and MPI version should not matter.
> 
> I might have missed something earlier in the thread, but parmetis has a 
> dependency on the machine's glibc srand, and it can (and does) create 
> different partitions with different srand versions. The same mesh on the same 
> code on the same process count can and will give different partitions 
> (possibly bad ones) on different machines.
> 
> On Tue, Sep 17, 2019 at 1:05 PM Mark Adams via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
> 
> 
> On Tue, Sep 17, 2019 at 12:53 PM Danyang Su <danyang...@gmail.com> wrote:
> Hi Mark,
> 
> Thanks for your follow-up. 
> 
> The unstructured grid code has been verified and there is no problem with the 
> results. The convergence rate is also good. The 3D mesh is not great; it is 
> based on the original stratum, which I haven't refined, but it is good for an 
> initial test because it is relatively small, and the results obtained from 
> this mesh still make sense.
> 
> The 2D meshes are just for testing purposes, as I want to reproduce the 
> partition problem on a cluster using PETSc 3.11.3 and Intel 2019. 
> Unfortunately, I could not reproduce the problem with this example. 
> 
> The code has no problem using different PETSc versions (PETSc v3.4 to 
> v3.11)
> 
> OK, it is the same code. I thought I saw something about your code changing.
> 
> Just to be clear, v3.11 never gives you good partitions. It is not just a 
> problem on this Intel cluster.
> 
> The machine, compiler and MPI version should not matter.
>  
> and MPI distributions (MPICH, OpenMPI, Intel MPI), except for one simulation 
> case (the mesh I attached) on a cluster with PETSc 3.11.3 and Intel 2019u4, 
> where the partition is very different from the one produced with PETSc 3.9.3. 
> Even then the simulation results are the same; the only issue is efficiency, 
> because the strange partition results in much more communication (ghost 
> nodes).
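> 
> One way to quantify the extra communication is to compare the MPI message 
> counts that -log_view reports on the two systems, e.g. (the executable name 
> below is just a placeholder):
> 
>   mpiexec -n 32 ./my_app -log_view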
> 
> I am still trying different compilers and MPI implementations with 
> PETSc 3.11.3 on that cluster to trace the problem. I will get back to you 
> when there is an update.
> 
> 
> This is very strange. You might want to use 'git bisect'. You set a good and 
> a bad SHA1 (we can give you these for 3.9 and 3.11, along with the exact 
> commands). Git will then check out a version in the middle. You reconfigure, 
> rebuild PETSc and your code, and run your test. You then tell git whether 
> that version is good or bad, and it picks the next version to try. Once you 
> get this workflow going it is not too bad, depending on how long this loop 
> takes, of course.
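> 
> A rough sketch of the bisect loop, with placeholder SHA1s and build commands 
> (not the exact ones we would send you):
> 
>   git bisect start
>   git bisect bad  <sha1-near-v3.11>    # a version with the bad partitions
>   git bisect good <sha1-near-v3.9>     # a version with the good partitions
>   # git checks out a commit roughly in the middle; then, in a loop:
>   ./configure [your usual options] && make all    # rebuild PETSc
>   #   ... rebuild your code and run the test ...
>   git bisect good        # or 'git bisect bad', depending on the result
>   # repeat until git reports the first bad commit, then clean up:
>   git bisect reset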
>  
> Thanks,
> 
> danyang
> 
