Quick update for the list: Matt and I were emailing back and forth a bit and I at least have a workaround for now.
It turns out that MVAPICH2 does include its own implementations of malloc/calloc. Matt believes (and I agree) that those should be private to the library, though. It looks like something about the way Julia is loading the library is exposing those symbols to PETSc.

For now I've worked around the issue by hacking the MVAPICH2 source to remove the definitions of malloc/calloc... and it DOES fix the "problem"... but that is definitely not the right answer. I'm going to talk to the Julia guys here at MIT tomorrow and see if I can get to the bottom of why those symbols are getting exposed when libmpi is loaded. (A quick way to check whether libmpi.so is actually exporting those symbols is sketched after the quoted message below.)

Thanks, Matt, for the help!

Derek

On Mon, Dec 5, 2016 at 11:56 PM Derek Gaston <[email protected]> wrote:

> Please excuse the slightly off-topic post, but I'm pulling my hair out
> here and I'm hoping someone else has seen this before.
>
> I'm calling PETSc from Julia and it's working great on my Mac with MPICH,
> but I'm seeing a segfault on my Linux cluster using MVAPICH2. I get the
> same segfault both with the "official" PETSc.jl and my own smaller wrapper,
> MiniPETSc.jl: https://github.com/friedmud/MiniPETSc.jl
>
> Here is the stack trace I'm seeing:
>
> signal (11): Segmentation fault
> while loading /home/gastdr/projects/falcon/julia_mpi.jl, in expression starting on line 5
> _int_malloc at /home/gastdr/projects/falcon/root/lib/libmpi.so.12 (unknown line)
> calloc at /home/gastdr/projects/falcon/root/lib/libmpi.so.12 (unknown line)
> PetscOptionsCreate at /home/gastdr/projects/falcon/petsc-3.7.3/src/sys/objects/options.c:2578
> PetscInitialize at /home/gastdr/projects/falcon/petsc-3.7.3/src/sys/objects/pinit.c:761
> PetscInitializeNoPointers at /home/gastdr/projects/falcon/petsc-3.7.3/src/sys/objects/pinit.c:111
> __init__ at /home/gastdr/.julia/v0.5/MiniPETSc/src/MiniPETSc.jl:14
>
> The script I'm running, run.jl, is simply:
>
> using MiniPETSc
>
> It feels like libmpi is not quite loaded correctly yet. It does get
> loaded by MPI.jl here:
> https://github.com/JuliaParallel/MPI.jl/blob/master/src/MPI.jl#L29 and
> I've verified that that code runs before PETSc is initialized.
>
> It looks OK to me... and I've tried a few variations on that dlopen() call
> and nothing makes it better.
>
> BTW: MPI.jl is working fine on its own. I can write pure MPI Julia apps
> and run them in parallel on the cluster. I just need to get this
> initialization of PETSc straightened out.
>
> Thanks for any help!
>
> Derek
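For anyone who wants to reproduce the check mentioned above: here is a minimal Julia sketch (not from the original thread) for confirming that the MVAPICH2 libmpi.so defines and exports its own malloc/calloc. The library path is copied from the stack trace; the RTLD_GLOBAL flag is an assumption about how MPI.jl's dlopen() call loads libmpi, and the snippet is purely diagnostic, not a fix.

    # Diagnostic sketch: does libmpi.so.12 export its own malloc/calloc?
    # On Julia 0.5 the Libdl module is already available from Base.

    # Path taken from the stack trace in the quoted message (adjust as needed).
    libmpi_path = "/home/gastdr/projects/falcon/root/lib/libmpi.so.12"

    # Assumption: open libmpi the way MPI.jl does, with RTLD_GLOBAL, so its
    # exported symbols can satisfy later lookups by other libraries (PETSc).
    handle = Libdl.dlopen(libmpi_path, Libdl.RTLD_GLOBAL | Libdl.RTLD_LAZY)

    # dlsym_e returns C_NULL if the symbol is not found in this library.
    # A non-NULL result means libmpi itself defines and exports the symbol,
    # so with RTLD_GLOBAL its definition can shadow glibc's for later lookups.
    calloc_ptr = Libdl.dlsym_e(handle, :calloc)
    malloc_ptr = Libdl.dlsym_e(handle, :malloc)
    println("libmpi exports calloc: ", calloc_ptr != C_NULL)
    println("libmpi exports malloc: ", malloc_ptr != C_NULL)

The same information is available from the shell with something like "nm -D libmpi.so.12 | grep -w -e malloc -e calloc"; if those symbols show up as defined (type T), that matches the behavior described above.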
