On Thu, Mar 21, 2019 at 1:57 PM Derek Gaston via petsc-users 
<petsc-users@mcs.anl.gov> wrote:
It sounds like you already tracked this down... but for completeness here is 
what track-origins gives:

==262923== Conditional jump or move depends on uninitialised value(s)
==262923==    at 0x73C6548: VecScatterMemcpyPlanCreate_Index (vscat.c:294)
==262923==    by 0x73DBD97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312)
==262923==    by 0x73DE6AE: VecScatterCreateCommon_PtoS_MPI1 (vpscat_mpi1.c:2328)
==262923==    by 0x73DFFEA: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2202)
==262923==    by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608)
==262923==    by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857)
==262923==    by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
==262923==    by 0x7413D39: VecScatterSetUp (vscatfce.c:212)
==262923==    by 0x7412D73: VecScatterCreateWithData (vscreate.c:333)
==262923==    by 0x747A232: VecCreateGhostWithArray (pbvec.c:685)
==262923==    by 0x747A90D: VecCreateGhost (pbvec.c:741)
==262923==    by 0x5C7FFD6: libMesh::PetscVector<double>::init(unsigned long, unsigned long, std::vector<unsigned long, std::allocator<unsigned long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752)
==262923==  Uninitialised value was created by a heap allocation

I checked the code but could not figure out what was wrong.  Perhaps you should 
use 64-bit integers and see whether the warning still exists.  Please remember 
to incorporate Stefano's bug fix.

==262923==    at 0x402DDC6: memalign (vg_replace_malloc.c:899)
==262923==    by 0x7359702: PetscMallocAlign (mal.c:41)
==262923==    by 0x7359C70: PetscMallocA (mal.c:390)
==262923==    by 0x73DECF0: VecScatterCreateLocal_PtoS_MPI1 (vpscat_mpi1.c:2061)
==262923==    by 0x73C7A51: VecScatterCreate_PtoS (vscat.c:608)
==262923==    by 0x73C9E8A: VecScatterSetUp_vectype_private (vscat.c:857)
==262923==    by 0x73CBE5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
==262923==    by 0x7413D39: VecScatterSetUp (vscatfce.c:212)
==262923==    by 0x7412D73: VecScatterCreateWithData (vscreate.c:333)
==262923==    by 0x747A232: VecCreateGhostWithArray (pbvec.c:685)
==262923==    by 0x747A90D: VecCreateGhost (pbvec.c:741)
==262923==    by 0x5C7FFD6: libMesh::PetscVector<double>::init(unsigned long, unsigned long, std::vector<unsigned long, std::allocator<unsigned long> > const&, bool, libMesh::ParallelType) (petsc_vector.h:752)


BTW: This turned out not to be my actual problem.  My actual problem was just 
some stupidity on my part... just a simple input parameter issue to my code 
(should have had better error checking!).

But: It sounds like my digging may have uncovered something real here... so it 
wasn't completely useless :-)

Thanks for your help everyone!

Derek



On Thu, Mar 21, 2019 at 10:38 AM Stefano Zampini 
<stefano.zamp...@gmail.com> wrote:


On Wed, Mar 20, 2019 at 11:40 PM Derek Gaston via petsc-users 
<petsc-users@mcs.anl.gov> wrote:
Trying to track down some memory corruption I'm seeing on larger scale runs 
(3.5B+ unknowns).

Uhm... are you using 32-bit indices? Is it possible there's an integer 
overflow somewhere?
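
To make the overflow question concrete, here is a minimal sketch in plain C (no 
PETSc calls). It assumes a global size of exactly 3,500,000,000, standing in for 
the "3.5B+" quoted above, and a signed 32-bit index type, which is what PetscInt 
is unless PETSc was configured with --with-64-bit-indices. The helper names 
fits_in_int32 and wrap_to_int32 are hypothetical and exist only for this 
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if a global index or size is representable
 * as a signed 32-bit integer (0 .. 2^31 - 1). */
static bool fits_in_int32(int64_t n)
{
    return n >= 0 && n <= INT32_MAX;
}

/* Hypothetical helper: the value a signed 32-bit index would hold if n were
 * reduced modulo 2^32 (two's-complement wraparound), computed in 64-bit
 * arithmetic so that no step relies on implementation-defined conversions. */
static int32_t wrap_to_int32(int64_t n)
{
    const int64_t two32 = INT64_C(4294967296); /* 2^32 */
    int64_t m = n % two32;
    if (m < 0) {
        m += two32;           /* bring the remainder into 0 .. 2^32 - 1 */
    }
    if (m > INT32_MAX) {
        m -= two32;           /* upper half maps to negative values */
    }
    return (int32_t)m;
}
```

With these assumptions, fits_in_int32(3500000000) is false because INT32_MAX is 
2147483647, and wrap_to_int32(3500000000) gives -794967296. A negative or 
otherwise wrong index of that kind is one way a scatter setup could end up 
reading memory it never initialized, though nothing in the traces in this 
thread confirms that this is the cause.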


Was able to run Valgrind on it... and I'm seeing quite a lot of uninitialized 
value errors coming from ghost updating.  Here are some of the traces:

==87695== Conditional jump or move depends on uninitialised value(s)
==87695==    at 0x73236D3: PetscMallocAlign (mal.c:28)
==87695==    by 0x7323C70: PetscMallocA (mal.c:390)
==87695==    by 0x739048E: VecScatterMemcpyPlanCreate_Index (vscat.c:284)
==87695==    by 0x73A5D97: VecScatterMemcpyPlanCreate_PtoP (vpscat_mpi1.c:312)
==64730==    by 0x7393E8A: VecScatterSetUp_vectype_private (vscat.c:857)
==64730==    by 0x7395E5D: VecScatterSetUp_MPI1 (vpscat_mpi1.c:2543)
==64730==    by 0x73DDD39: VecScatterSetUp (vscatfce.c:212)
==64730==    by 0x73DCD73: VecScatterCreateWithData (vscreate.c:333)
==64730==    by 0x7444232: VecCreateGhostWithArray (pbvec.c:685)
==64730==    by 0x744490D: VecCreateGhost (pbvec.c:741)

==133582== Conditional jump or move depends on uninitialised value(s)
==133582==    at 0x4030384: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:1034)
==133582==    by 0x739E4F9: PetscMemcpy (petscsys.h:1649)
==133582==    by 0x739E4F9: VecScatterMemcpyPlanExecute_Pack (vecscatterimpl.h:150)
==133582==    by 0x739E4F9: VecScatterBeginMPI1_1 (vpscat_mpi1.h:69)
==133582==    by 0x73DD964: VecScatterBegin (vscatfce.c:110)
==133582==    by 0x744E195: VecGhostUpdateBegin (commonmpvec.c:225)

This is from a Git checkout of PETSc... the hash I branched from is: 
0e667e8fea4aa from December 23rd (updating would be really hard at this point 
as I've completed 90% of my dissertation with this version... and changing 
PETSc now would be pretty painful!).

Any ideas?  Is it possible it's in my code?  Is it possible that there are 
later PETSc commits that already fix this?

Thanks for any help,
Derek



--
Stefano
