Re: [OMPI devel] OpenMPI 2.0 and Petsc 3.7.2

2016-07-25 Thread Nathan Hjelm
It looks to me like a double free on both the send and receive requests. The 
receive-side free is an extra OBJ_RELEASE of MPI_DOUBLE, which was never 
malloced (invalid free). The send-side free is an assert failure in OBJ_RELEASE 
of an OBJ_NEW() object (invalid magic). I plan to look at it in the next couple 
of days. Let me know if you figure it out before I get to it.
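Nathan's diagnosis refers to Open MPI's internal reference counting: OBJ_RELEASE destroys an object when its count reaches zero, checks a magic number on every release, and must never free() predefined, statically allocated objects such as ompi_mpi_double. A minimal sketch of that scheme (class and function names here are illustrative, not the actual OPAL implementation):

```python
# Illustrative sketch of an OBJ_NEW/OBJ_RELEASE-style refcount scheme;
# names and fields are hypothetical, not the actual OPAL code.
OBJ_MAGIC = 0xCAFEBABE

class Obj:
    def __init__(self, is_static=False):
        self.magic = OBJ_MAGIC      # checked on every release ("invalid magic")
        self.refcount = 1
        self.is_static = is_static  # e.g. ompi_mpi_double: never malloced

def obj_release(obj):
    """Drop one reference; return True if the object was destroyed."""
    if obj.magic != OBJ_MAGIC:
        # a second release of an OBJ_NEW() object -> the assert Nathan mentions
        raise AssertionError("invalid magic: object already released")
    obj.refcount -= 1
    if obj.refcount == 0:
        obj.magic = 0  # poison the object so a double release is caught
        if obj.is_static:
            # calling free() here would be the "free(): invalid pointer"
            # that glibc reports in the backtraces below
            return True
        # heap-allocated object: this is where free() would happen
        return True
    return False
```

A second obj_release() on an already-destroyed object raises, mirroring the "invalid magic" assert on the send request; free()-ing the static ompi_mpi_double instead surfaces as glibc's invalid-free abort, as in the receive path.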

-Nathan

> On Jul 25, 2016, at 8:38 PM, Gilles Gouaillardet  wrote:
> 
> Eric,
> 
> where can your test case be downloaded? How many nodes and tasks do you need 
> to reproduce the bug?
> 
> fwiw, there are currently two Open MPI repositories:
> - https://github.com/open-mpi/ompi
>  there is only one branch, 'master'; today, this can be seen 
> as Open MPI 3.0 pre-alpha
> - https://github.com/open-mpi/ompi-release
>  the default branch is 'v2.x'; today, this can be seen as Open MPI 2.0.1 
> pre-alpha
> 
> Cheers,
> 
> Gilles
> 
> On 7/26/2016 3:33 AM, Eric Chamberland wrote:
>> Hi,
>> 
>> Has anyone tried OpenMPI 2.0 with PETSc 3.7.2?
>> 
>> I am having some errors with PETSc; maybe someone else has them too?
>> 
>> Here are the configure logs for PETSc:
>> 
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log
>> 
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log
>> 
>> And for OpenMPI:
>> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log
>> 
>> (in fact, I am testing the ompi-release branch, a sort of petsc-master 
>> branch, since I need the commit 9ba6678156).
>> 
>> For a set of parallel tests, 104 pass out of 124 total tests.
>> 
>> And the typical error:
>> *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': free(): invalid pointer:
>> === Backtrace: =
>> /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f]
>> /lib64/libc.so.6(+0x78026)[0x7f80eb11c026]
>> /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53]
>> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60]
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628]
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50]
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd]
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334]
>> 
>> a similar one:
>> *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': free(): invalid pointer: 0x7f382a7c5bc0 ***
>> === Backtrace: =
>> /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f]
>> /lib64/libc.so.6(+0x78026)[0x7f3829f22026]
>> /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53]
>> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60]
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628]
>> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50]
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd]
>> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45]
>> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755]
>> /op

Re: [OMPI devel] OpenMPI 2.0 and Petsc 3.7.2

2016-07-25 Thread Gilles Gouaillardet

Eric,

where can your test case be downloaded? How many nodes and tasks do you 
need to reproduce the bug?


fwiw, there are currently two Open MPI repositories:
- https://github.com/open-mpi/ompi
  there is only one branch, 'master'; today, this can 
be seen as Open MPI 3.0 pre-alpha

- https://github.com/open-mpi/ompi-release
  the default branch is 'v2.x'; today, this can be seen as Open MPI 
2.0.1 pre-alpha


Cheers,

Gilles

On 7/26/2016 3:33 AM, Eric Chamberland wrote:

Hi,

Has anyone tried OpenMPI 2.0 with PETSc 3.7.2?

I am having some errors with PETSc; maybe someone else has them too?

Here are the configure logs for PETSc:

http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log

http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log

And for OpenMPI:
http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log

(in fact, I am testing the ompi-release branch, a sort of petsc-master 
branch, since I need the commit 9ba6678156).


For a set of parallel tests, 104 pass out of 124 total tests.

And the typical error:
*** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': free(): invalid pointer:
=== Backtrace: =
/lib64/libc.so.6(+0x7277f)[0x7f80eb11677f]
/lib64/libc.so.6(+0x78026)[0x7f80eb11c026]
/lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53]
/opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334]



a similar one:
*** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': free(): invalid pointer: 0x7f382a7c5bc0 ***
=== Backtrace: =
/lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f]
/lib64/libc.so.6(+0x78026)[0x7f3829f22026]
/lib64/libc.so.6(+0x78d53)[0x7f3829f22d53]
/opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334]



another one:

*** Error in 
`/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': 
free(): invalid pointer: 0x7f67b6d37bc0 ***

=== Backtrace: =
/lib64/libc.so.6(+0x7277f)[0x7f67b648e77f]
/lib64/libc.so.6(+0x78026)[0x7f67b6494026]
/lib64/libc.so.6(+0x78d53)[0x7f67b6494d53]
/opt/openmpi-2.x_opt/lib/libop

Re: [OMPI devel] PGI built Open MPI vs GNU built slurm

2016-07-25 Thread Gilles Gouaillardet

Paul,

in my environment, libslurm.la contains

# Linker flags that can not go in dependency_libs.
inherited_linker_flags=' -pthread'

# Libraries that this one depends upon.
dependency_libs=' -ldl -lpthread'


so bottom line, it invokes the compiler with both -pthread and -lpthread


iirc, -pthread does two things:

- invoke the compiler with -D_REENTRANT (so it uses the thread-safe 
errno and so on)

- invoke the linker with -lpthread

Open MPI has its own way to pass -D_REENTRANT or similar anyway, and 
libslurm.la is used only for linking.


since -lpthread is pulled from libslurm.la anyway (or it was already set 
by Open MPI), then yes, discarding -pthread should do the trick.
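The flags Gilles quotes live in the libtool archive's plain-text metadata, so they are easy to inspect programmatically. A small sketch: LA_TEXT reproduces the two lines quoted above, while la_variable is a hypothetical helper, not part of libtool.

```python
# Sketch: read the linker-flag metadata out of a libtool .la file.
# LA_TEXT mirrors the libslurm.la excerpt quoted above; la_variable
# is a hypothetical helper, not a libtool API.
import re

LA_TEXT = """\
# Linker flags that can not go in dependency_libs.
inherited_linker_flags=' -pthread'

# Libraries that this one depends upon.
dependency_libs=' -ldl -lpthread'
"""

def la_variable(text, name):
    """Return the tokens of a shell-style var='...' assignment in a .la file."""
    m = re.search(r"^%s='([^']*)'" % re.escape(name), text, re.MULTILINE)
    return m.group(1).split() if m else []

flags = la_variable(LA_TEXT, "inherited_linker_flags")
deps = la_variable(LA_TEXT, "dependency_libs")
```

This shows the bottom line above directly: the link pulls in both -pthread (from inherited_linker_flags) and -lpthread (from dependency_libs).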



Cheers,


Gilles


On 7/26/2016 10:11 AM, Paul Hargrove wrote:

Gilles,

My initial thought is that libslurm probably does require linking 
libpthread, either for linking pthread_* symbols, or for proper 
*operation* (such as thread-safe versions of functions which override 
weak definitions in libc).


If so, then neither omitting "-pthread" nor telling pgcc not to 
complain about "-pthread" is going to be a good solution.

Instead, the "-pthread" needs to be replaced by "-lpthread", or similar.

-Paul

On Mon, Jul 25, 2016 at 6:03 PM, Gilles Gouaillardet 
<gil...@rist.or.jp> wrote:


Folks,


This is a followup of a thread that initially started at
http://www.open-mpi.org/community/lists/users/2016/07/29635.php


The user is trying to build Open MPI with the PGI compiler and
libslurm.la/libpmi.la support, and slurm was built with the gcc
compiler.


At first, it fails because the "-pthread" flag is pulled from
libslurm.la/libpmi.la, but this flag is not supported by PGI
compilers.

A workaround is to pass the -noswitcherror flag to the PGI
compiler (so the -pthread flag is discarded and a warning message
is issued, but the PGI compiler does not fail). Unfortunately, that
does not work because libtool does not pass this flag to the
PGI compiler.


Of course, one option is to tell the user to rebuild slurm with
PGI, so libslurm.la/libpmi.la do not have the "-pthread" flag.

A nicer though arguable option is to hack libtool to silently drop
the "-pthread" flag when the PGI compiler is used (I made a proof of
concept, and this is a two-line patch).

Another, cleaner option is to hack libtool so it passes
-noswitcherror to the PGI compiler, but I do not know how to achieve this.


Any thoughts?


Cheers

___
devel mailing list
de...@open-mpi.org 
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post:
http://www.open-mpi.org/community/lists/devel/2016/07/19278.php




--
Paul H. Hargrove phhargr...@lbl.gov 
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


___
devel mailing list
de...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: 
http://www.open-mpi.org/community/lists/devel/2016/07/19279.php




Re: [OMPI devel] PGI built Open MPI vs GNU built slurm

2016-07-25 Thread Paul Hargrove
Gilles,

My initial thought is that libslurm probably does require linking
libpthread, either for linking pthread_* symbols, or for proper
*operation* (such as thread-safe versions of functions which override weak
definitions in libc).

If so, then neither omitting "-pthread" nor telling pgcc not to complain
about "-pthread" is going to be a good solution.
Instead, the "-pthread" needs to be replaced by "-lpthread", or similar.
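Paul's substitution can be sketched mechanically: rewrite the gcc-ism "-pthread" into an explicit "-lpthread" before the link line reaches pgcc. The helper below is hypothetical (a real fix would live in libtool or the .la files); only the substitution rule itself comes from the discussion.

```python
# Sketch of the suggested fix: map the gcc-specific "-pthread" flag to an
# explicit "-lpthread" that PGI's pgcc understands. Hypothetical helper;
# a real fix would be applied inside libtool or the .la files.
def fix_link_flags(flags):
    """Replace -pthread with -lpthread, dropping a resulting duplicate."""
    out = []
    for f in flags:
        f = "-lpthread" if f == "-pthread" else f
        if f == "-lpthread" and "-lpthread" in out:
            continue  # -lpthread may already be pulled in via dependency_libs
        out.append(f)
    return out
```

With the flags seen in the libslurm.la case, fix_link_flags(["-pthread", "-ldl", "-lpthread"]) collapses to a single -lpthread plus -ldl, which still links the pthread_* symbols libslurm needs.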

-Paul

On Mon, Jul 25, 2016 at 6:03 PM, Gilles Gouaillardet 
wrote:

> Folks,
>
>
> This is a followup of a thread that initially started at
> http://www.open-mpi.org/community/lists/users/2016/07/29635.php
>
>
> The user is trying to build Open MPI with PGI compiler and
> libslurm.la/libpmi.la support, and slurm was built with gcc compiler.
>
>
> At first, it fails because the "-pthread" flag is pulled from
> libslurm.la/libpmi.la, but this flag is not supported by PGI compilers.
>
> A workaround is to pass the -noswitcherror flag to the PGI compiler (so
> the -pthread flag is discarded and a warning message is issued, but the PGI
> compiler does not fail). Unfortunately, that does not work because libtool
> does not pass this flag to the PGI compiler.
>
>
> Of course, one option is to tell the user to rebuild slurm with PGI, so
> libslurm.la/libpmi.la do not have the "-pthread" flag.
>
> A nicer though arguable option is to hack libtool to silently drop the
> "-pthread" flag when the PGI compiler is used (I made a proof of concept, and
> this is a two-line patch).
>
> Another, cleaner option is to hack libtool so it passes -noswitcherror to
> the PGI compiler, but I do not know how to achieve this.
>
>
> Any thoughts ?
>
>
> Cheers
>
> ___
> devel mailing list
> de...@open-mpi.org
> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2016/07/19278.php
>



-- 
Paul H. Hargrove  phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900


[OMPI devel] PGI built Open MPI vs GNU built slurm

2016-07-25 Thread Gilles Gouaillardet

Folks,


This is a followup of a thread that initially started at 
http://www.open-mpi.org/community/lists/users/2016/07/29635.php



The user is trying to build Open MPI with PGI compiler and 
libslurm.la/libpmi.la support, and slurm was built with gcc compiler.



At first, it fails because the "-pthread" flag is pulled from 
libslurm.la/libpmi.la, but this flag is not supported by PGI compilers.


A workaround is to pass the -noswitcherror flag to the PGI compiler (so 
the -pthread flag is discarded and a warning message is issued, but the 
PGI compiler does not fail). Unfortunately, that does not work because 
libtool does not pass this flag to the PGI compiler.



Of course, one option is to tell the user to rebuild slurm with PGI, so 
libslurm.la/libpmi.la do not have the "-pthread" flag.


A nicer though arguable option is to hack libtool to silently drop the 
"-pthread" flag when the PGI compiler is used (I made a proof of 
concept, and this is a two-line patch).


Another, cleaner option is to hack libtool so it passes -noswitcherror 
to the PGI compiler, but I do not know how to achieve this.



Any thoughts ?


Cheers



Re: [OMPI devel] [petsc-users] OpenMPI 2.0 and Petsc 3.7.2

2016-07-25 Thread Eric Chamberland

Ok,

here are the two points, answered:

#1) got valgrind output... here is the fatal free operation:

==107156== Invalid free() / delete / delete[] / realloc()
==107156==    at 0x4C2A37C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==107156==    by 0x1E63CD5F: opal_free (malloc.c:184)
==107156==    by 0x27622627: mca_pml_ob1_recv_request_fini (pml_ob1_recvreq.h:133)
==107156==    by 0x27622C4F: mca_pml_ob1_recv_request_free (pml_ob1_recvreq.c:90)
==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)
==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
==107156==    by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219)
==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
==107156==    by 0x14A33809: VecDestroy (vector.c:432)
==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115)
==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292)
==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287)
==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281)
==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216)
==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)
==107156==    by 0x435702: main (Test.ProblemeGD.icc:381)
==107156==  Address 0x1d6acbc0 is 0 bytes inside data symbol "ompi_mpi_double"
--107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to 0x4c2f330 (__GI_stpcpy)
==107156==
==107156== Process terminating with default action of signal 6 (SIGABRT): dumping core
==107156==    at 0x1DD520C7: raise (in /lib64/libc-2.19.so)
==107156==    by 0x1DD53534: abort (in /lib64/libc-2.19.so)
==107156==    by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so)
==107156==    by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so)
==107156==    by 0x27626D12: mca_pml_ob1_send_request_fini (pml_ob1_sendreq.h:221)
==107156==    by 0x276274C9: mca_pml_ob1_send_request_free (pml_ob1_sendreq.c:117)
==107156==    by 0x1D3EF9DC: ompi_request_free (request.h:362)
==107156==    by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59)
==107156==    by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225)
==107156==    by 0x14ADEB74: VecScatterDestroy (vscat.c:1860)
==107156==    by 0x14A8D426: VecDestroy_MPI (pdvec.c:25)
==107156==    by 0x14A33809: VecDestroy (vector.c:432)
==107156==    by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115)
==107156==    by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292)
==107156==    by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287)
==107156==    by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281)
==107156==    by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216)
==107156==    by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so)
==107156==    by 0x435702: main (Test.ProblemeGD.icc:381)


#2) For the run with -vecscatter_alltoall it works...!

As an "end user", should I ever modify these VecScatterCreate options? 
How do they change the performance of the code on large problems?


Thanks,

Eric

On 25/07/16 02:57 PM, Matthew Knepley wrote:

On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland
<eric.chamberl...@giref.ulaval.ca> wrote:

Hi,

Has anyone tried OpenMPI 2.0 with PETSc 3.7.2?

I am having some errors with PETSc; maybe someone else has them too?

Here are the configure logs for PETSc:


http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log


http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log

And for OpenMPI:

http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log

(in fact, I am testing the ompi-release branch, a sort of
petsc-master branch, since I need the commit 9ba6678156).

For a set of parallel tests, 104 pass out of 124 total tests.


It appears that the fault happens when freeing the VecScatter we build
for MatMult, which contains Request structures
for the ISends and IRecvs. These look like internal OpenMPI errors to
me since the Request should be opaque.
I would try at least two things:

1) Run under valgrind.

2) Switch the VecScatter implementation. All the options are here,

  
http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate

but maybe use alltoall.
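For reference, the switch Matt suggests is a run-time PETSc options-database flag, so no rebuild is needed; a sketch of the invocation (the executable name is taken from the backtraces above, and later in the thread this option is reported to avoid the crash):

```
# select the all-to-all VecScatter implementation at run time
mpiexec -n 4 ./Test.ProblemeGD.dev -vecscatter_alltoall
```

The full list of -vecscatter_* options is on the VecScatterCreate manual page linked above.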

  Thanks,

 Matt


And the typical error:
*** Error in

`/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev':
free(): invalid pointer:
=== Backtrace: =
/lib64/libc.so.6(+0x7277f)[0x7f80eb11677f]
/lib64/libc.so.6(+0x780

[OMPI devel] OpenMPI 2.0 and Petsc 3.7.2

2016-07-25 Thread Eric Chamberland

Hi,

Has anyone tried OpenMPI 2.0 with PETSc 3.7.2?

I am having some errors with PETSc; maybe someone else has them too?

Here are the configure logs for PETSc:

http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log

http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log

And for OpenMPI:
http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log

(in fact, I am testing the ompi-release branch, a sort of petsc-master 
branch, since I need the commit 9ba6678156).


For a set of parallel tests, 104 pass out of 124 total tests.

And the typical error:
*** Error in 
`/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': 
free(): invalid pointer:

=== Backtrace: =
/lib64/libc.so.6(+0x7277f)[0x7f80eb11677f]
/lib64/libc.so.6(+0x78026)[0x7f80eb11c026]
/lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53]
/opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334]

a similar one:
*** Error in 
`/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': 
free(): invalid pointer: 0x7f382a7c5bc0 ***

=== Backtrace: =
/lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f]
/lib64/libc.so.6(+0x78026)[0x7f3829f22026]
/lib64/libc.so.6(+0x78d53)[0x7f3829f22d53]
/opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334]

another one:

*** Error in 
`/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': 
free(): invalid pointer: 0x7f67b6d37bc0 ***

=== Backtrace: =
/lib64/libc.so.6(+0x7277f)[0x7f67b648e77f]
/lib64/libc.so.6(+0x78026)[0x7f67b6494026]
/lib64/libc.so.6(+0x78d53)[0x7f67b6494d53]
/opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae]
/opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd]
/opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45]
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+

Re: [OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-25 Thread Eric Chamberland

Hi Edgard,

just to say that I tested your fix that has been merged into 
ompi-release/v2.x (9ba667815), and it works! :)


Thanks!

Eric

On 12/07/16 04:30 PM, Edgar Gabriel wrote:

I think the decision was made to postpone the patch to 2.0.1, since the
release of 2.0.0 is imminent.

Thanks
Edgar

On 7/12/2016 2:51 PM, Eric Chamberland wrote:

Hi Edgard,

I just saw that your patch got into ompi/master... any chance it goes
into ompi-release/v2.x before rc5?

thanks,

Eric


On 08/07/16 03:14 PM, Edgar Gabriel wrote:

I think I found the problem. I filed a PR towards master, and if that
passes I will file a PR for the 2.x branch.

Thanks!
Edgar


On 7/8/2016 1:14 PM, Eric Chamberland wrote:

On 08/07/16 01:44 PM, Edgar Gabriel wrote:

ok, but just to be able to construct a test case, basically what
you are
doing is

MPI_File_write_all_begin (fh, NULL, 0, some datatype);

MPI_File_write_all_end (fh, NULL, &status),

is this correct?

Yes, but with 2 processes:

rank 0 writes something, but not rank 1...

other info: rank 0 didn't wait for rank 1 after
MPI_File_write_all_end, so
it continued to the next MPI_File_write_all_begin with a different
datatype but on the same file...

thanks!

Eric
___
devel mailing list
de...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post:
http://www.open-mpi.org/community/lists/devel/2016/07/19173.php

___
devel mailing list
de...@open-mpi.org
Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post:
http://www.open-mpi.org/community/lists/devel/2016/07/19192.php