[OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-08 Thread Eric Chamberland

Hi,

I am testing the 2.X release candidate for the first time.

I get a segmentation violation in MPI_File_write_all_end(MPI_File
fh, const void *buf, MPI_Status *status).


The "special" thing may be that in the faulty test cases there are
processes that haven't written anything, so they have a zero-length buffer
and the second parameter (buf) passed is a null pointer.


Until now this was a valid call; has that changed?

Thanks,

Eric

FWIW: We have been using our test suite (~2000 nightly tests) successfully
with openmpi-1.{6,8,10}.* and MPICH for many years...


Re: [OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-08 Thread Edgar Gabriel
The default MPI I/O library has changed in the 2.x release to OMPIO for
most file systems. I can look into that problem; any chance to get
access to the test suite that you mentioned?


Thanks
Edgar


On 7/8/2016 11:32 AM, Eric Chamberland wrote:

Hi,

I am testing the 2.X release candidate for the first time.

I get a segmentation violation in MPI_File_write_all_end(MPI_File
fh, const void *buf, MPI_Status *status).

The "special" thing may be that in the faulty test cases there are
processes that haven't written anything, so they have a zero-length buffer
and the second parameter (buf) passed is a null pointer.

Until now this was a valid call; has that changed?

Thanks,

Eric

FWIW: We have been using our test suite (~2000 nightly tests) successfully
with openmpi-1.{6,8,10}.* and MPICH for many years...


--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science, University of Houston
Philip G. Hoffman Hall, Room 524, Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335
--



Re: [OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-08 Thread Eric Chamberland

Hi,

On 08/07/16 12:52 PM, Edgar Gabriel wrote:

The default MPI I/O library has changed in the 2.x release to OMPIO for


ok, I am now doing I/O on my own hard drive... but I can test over NFS 
easily.  For Lustre, I will have to produce a reduced example out of our 
test suite...



most file systems. I can look into that problem; any chance to get
access to the test suite that you mentioned?


Yikes! Sounds interesting, but difficult to realize...  Our in-house 
code is not public... :/


I have, however, proposed (to myself) to add a nightly compilation of openmpi
(see http://www.open-mpi.org/community/lists/users/2016/06/29515.php) so
I can report problems before releases are made...


Anyway, I will work on a little script to automate the
MPI+PETSc+InHouseCode combination so I can give you feedback as soon
as you propose a patch for me to test...


Hoping this will be convenient enough for you...

Thanks!

Eric



Thanks
Edgar


On 7/8/2016 11:32 AM, Eric Chamberland wrote:

Hi,

I am testing the 2.X release candidate for the first time.

I get a segmentation violation in MPI_File_write_all_end(MPI_File
fh, const void *buf, MPI_Status *status).

The "special" thing may be that in the faulty test cases there are
processes that haven't written anything, so they have a zero-length buffer
and the second parameter (buf) passed is a null pointer.

Until now this was a valid call; has that changed?

Thanks,

Eric

FWIW: We have been using our test suite (~2000 nightly tests) successfully
with openmpi-1.{6,8,10}.* and MPICH for many years...




Re: [OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-08 Thread Edgar Gabriel
OK, but just to be able to construct a test case: basically, what you are
doing is


MPI_File_write_all_begin (fh, NULL, 0, some datatype);

MPI_File_write_all_end (fh, NULL, &status);

Is this correct?
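
For reference, a compilable version of this sketch could look like the
following (MPI_BYTE stands in for "some datatype", and the file name is a
placeholder; this is an illustrative reconstruction, not the actual test
code):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File   fh;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_File_open(MPI_COMM_WORLD, "testfile.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Zero-length contribution: count == 0 and buf == NULL. */
    MPI_File_write_all_begin(fh, NULL, 0, MPI_BYTE);
    MPI_File_write_all_end(fh, NULL, &status);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}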

Thanks

Edgar


On 7/8/2016 12:19 PM, Eric Chamberland wrote:

Hi,

On 08/07/16 12:52 PM, Edgar Gabriel wrote:

The default MPI I/O library has changed in the 2.x release to OMPIO for

ok, I am now doing I/O on my own hard drive... but I can test over NFS
easily.  For Lustre, I will have to produce a reduced example out of our
test suite...


most file systems. I can look into that problem; any chance to get
access to the test suite that you mentioned?

Yikes! Sounds interesting, but difficult to realize...  Our in-house
code is not public... :/

I have, however, proposed (to myself) to add a nightly compilation of openmpi
(see http://www.open-mpi.org/community/lists/users/2016/06/29515.php) so
I can report problems before releases are made...

Anyway, I will work on a little script to automate the
MPI+PETSc+InHouseCode combination so I can give you feedback as soon
as you propose a patch for me to test...

Hoping this will be convenient enough for you...

Thanks!

Eric


Thanks
Edgar


On 7/8/2016 11:32 AM, Eric Chamberland wrote:

Hi,

I am testing the 2.X release candidate for the first time.

I get a segmentation violation in MPI_File_write_all_end(MPI_File
fh, const void *buf, MPI_Status *status).

The "special" thing may be that in the faulty test cases there are
processes that haven't written anything, so they have a zero-length buffer
and the second parameter (buf) passed is a null pointer.

Until now this was a valid call; has that changed?

Thanks,

Eric

FWIW: We have been using our test suite (~2000 nightly tests) successfully
with openmpi-1.{6,8,10}.* and MPICH for many years...


--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science, University of Houston
Philip G. Hoffman Hall, Room 524, Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335
--



Re: [OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-08 Thread Eric Chamberland



On 08/07/16 01:44 PM, Edgar Gabriel wrote:

OK, but just to be able to construct a test case: basically, what you are
doing is

MPI_File_write_all_begin (fh, NULL, 0, some datatype);

MPI_File_write_all_end (fh, NULL, &status);

Is this correct?


Yes, but with 2 processes:

rank 0 writes something, but rank 1 does not...

Other info: rank 0 didn't wait for rank 1 after MPI_File_write_all_end, so
it continued to the next MPI_File_write_all_begin with a different
datatype, but on the same file...
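
For reference, a sketch of that two-process scenario (file name, payloads,
and datatypes are placeholders, not the actual test-suite code; run with
mpirun -np 2):

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File   fh;
    MPI_Status status;
    int        rank;
    int        ibuf[4] = {1, 2, 3, 4};
    double     dbuf[2] = {1.0, 2.0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_File_open(MPI_COMM_WORLD, "testfile.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* First split collective: rank 0 writes, rank 1 contributes nothing
     * (count == 0, buf == NULL). */
    if (rank == 0) {
        MPI_File_write_all_begin(fh, ibuf, 4, MPI_INT);
        MPI_File_write_all_end(fh, ibuf, &status);
    } else {
        MPI_File_write_all_begin(fh, NULL, 0, MPI_INT);
        MPI_File_write_all_end(fh, NULL, &status);
    }

    /* No barrier here: rank 0 goes straight on to a second split
     * collective on the same file with a different datatype. */
    if (rank == 0) {
        MPI_File_write_all_begin(fh, dbuf, 2, MPI_DOUBLE);
        MPI_File_write_all_end(fh, dbuf, &status);
    } else {
        MPI_File_write_all_begin(fh, NULL, 0, MPI_DOUBLE);
        MPI_File_write_all_end(fh, NULL, &status);
    }

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}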


thanks!

Eric


Re: [OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-08 Thread Edgar Gabriel
I think I found the problem. I filed a PR against master, and if that
passes I will file a PR for the 2.x branch.


Thanks!
Edgar


On 7/8/2016 1:14 PM, Eric Chamberland wrote:


On 08/07/16 01:44 PM, Edgar Gabriel wrote:

OK, but just to be able to construct a test case: basically, what you are
doing is

MPI_File_write_all_begin (fh, NULL, 0, some datatype);

MPI_File_write_all_end (fh, NULL, &status);

Is this correct?

Yes, but with 2 processes:

rank 0 writes something, but rank 1 does not...

Other info: rank 0 didn't wait for rank 1 after MPI_File_write_all_end, so
it continued to the next MPI_File_write_all_begin with a different
datatype, but on the same file...

thanks!

Eric


--
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science, University of Houston
Philip G. Hoffman Hall, Room 524, Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335
--