Re: [OMPI users] False positives with OpenMPI and memchecker (seems fixed between 3.0.0 and 3.0.1-rc1)

2018-01-06 Thread yvan . fournier
Hello,

Answering myself here: checking the revision history, commits
3b8b8c52c519f64cb3ff147db49fcac7cbd0e7d7 and
66c9485e77f7da9a212ae67c88a21f95f13e6652 (in master) seem to relate to this,
so I checked using the latest downloadable 3.0.x nightly snapshot, and I no
longer reproduce the issue...

Sorry for the (too-late) report...

  Yvan



Re: [OMPI users] False positives with OpenMPI and memchecker

2018-01-06 Thread George Bosilca
Hi Yvan,

You mention a test. Can you make it available, either on the mailing list, in
a GitHub issue, or privately?

  Thanks,
George.




Re: [OMPI users] False positives with OpenMPI and memchecker (with attachment)

2018-01-06 Thread yvan . fournier
Hello,

Sorry, I forgot the attached test case in my previous message... :(

Best regards,

  Yvan Fournier

----- Forwarded Mail -----
From: "yvan fournier" 
To: users@lists.open-mpi.org
Sent: Sunday, January 7, 2018 01:43:16
Subject: False positives with OpenMPI and memchecker

Hello,

I obtain false positives with OpenMPI when memchecker is enabled, using
OpenMPI 3.0.0.
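
For reference, the build tested here has memchecker support compiled in; a
typical configure line for this kind of build would be along these lines (the
prefix matches the paths in the traces below; as far as I know,
--enable-memchecker also requires the Valgrind development headers, and a
debug build is recommended):

  ./configure --prefix=$HOME/opt/openmpi-3.0 \
              --enable-debug --enable-memchecker --with-valgrind=/usr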

This is similar to an issue I reported, which was fixed in Nov. 2016, but it
affects MPI_Isend/MPI_Irecv instead of MPI_Send/MPI_Recv.
I had not done much additional testing of my application with memchecker
since then, so I may have missed remaining issues at the time.

In the attached test (which has 2 optional variants controlling whether the
send and receive buffers are allocated on the stack or on the heap, but both
exhibiting the same basic issue), I get the following (running "mpicc -g
vg_ompi_isend_irecv.c && mpiexec -n 2 valgrind ./a.out"):

==19651== Memcheck, a memory error detector
==19651== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==19651== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==19651== Command: ./a.out
==19651== 
==19650== Thread 3:
==19650== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
==19650==at 0x5470596: epoll_pwait (in /usr/lib/libc-2.26.so)
==19650==by 0x5A5A9FA: epoll_dispatch (epoll.c:407)
==19650==by 0x5A5EA9A: opal_libevent2022_event_base_loop (event.c:1630)
==19650==by 0x94C96ED: progress_engine (in 
/home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pmix_pmix2x.so)
==19650==by 0x5163089: start_thread (in /usr/lib/libpthread-2.26.so)
==19650==by 0x547042E: clone (in /usr/lib/libc-2.26.so)
==19650==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==19650== 
==19651== Thread 3:
==19651== Syscall param epoll_pwait(sigmask) points to unaddressable byte(s)
==19651==at 0x5470596: epoll_pwait (in /usr/lib/libc-2.26.so)
==19651==by 0x5A5A9FA: epoll_dispatch (epoll.c:407)
==19651==by 0x5A5EA9A: opal_libevent2022_event_base_loop (event.c:1630)
==19651==by 0x94C96ED: progress_engine (in 
/home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pmix_pmix2x.so)
==19651==by 0x5163089: start_thread (in /usr/lib/libpthread-2.26.so)
==19651==by 0x547042E: clone (in /usr/lib/libc-2.26.so)
==19651==  Address 0x0 is not stack'd, malloc'd or (recently) free'd
==19651== 
==19650== Thread 1:
==19650== Invalid read of size 2
==19650==at 0x4C33BA0: memmove (in 
/usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19650==by 0x5A27C85: opal_convertor_pack (in 
/home/yvan/opt/openmpi-3.0/lib/libopen-pal.so.40.0.0)
==19650==by 0xD177EF1: mca_btl_vader_sendi (in 
/home/yvan/opt/openmpi-3.0/lib/openmpi/mca_btl_vader.so)
==19650==by 0xE1A7F31: mca_pml_ob1_send_inline.constprop.4 (in 
/home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pml_ob1.so)
==19650==by 0xE1A8711: mca_pml_ob1_isend (in 
/home/yvan/opt/openmpi-3.0/lib/openmpi/mca_pml_ob1.so)
==19650==by 0x4EB4C83: PMPI_Isend (in 
/home/yvan/opt/openmpi-3.0/lib/libmpi.so.40.0.0)
==19650==by 0x108B24: main (vg_ompi_isend_irecv.c:63)
==19650==  Address 0x1ffefffcc4 is on thread 1's stack
==19650==  in frame #6, created by main (vg_ompi_isend_irecv.c:7)

The first 2 warnings seem to relate to initialization, so they are not a big
issue, but the last one occurs whenever I use MPI_Isend, so it is a more
serious issue.
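
My understanding (an assumption on my part, I have not checked the Open MPI
sources) is that memchecker marks a pending nonblocking send buffer as
off-limits through Valgrind client requests, so any read of that buffer
between MPI_Isend and request completion, even by the library's own packing
code as in the trace above, gets reported. A minimal sketch of that mechanism,
using the real macros from <valgrind/memcheck.h> (the two function names are
mine, for illustration only):

  #include <stddef.h>
  #include <valgrind/memcheck.h>

  /* Called when a nonblocking send starts: the application (and, if the
     marking is too eager, the MPI library itself) must not read the buffer. */
  static void mark_send_buffer_pending(const void *buf, size_t len)
  {
    VALGRIND_MAKE_MEM_NOACCESS(buf, len);
  }

  /* Called when the matching request completes: the buffer is usable again. */
  static void mark_send_buffer_done(const void *buf, size_t len)
  {
    VALGRIND_MAKE_MEM_DEFINED(buf, len);
  }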

Using a version built without --enable-memchecker, I also get the two
initialization warnings, but not the warning from MPI_Isend...
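
Those two initialization warnings can at least be silenced with a Valgrind
suppression; an untested sketch, derived from the stack traces above:

  {
     ompi-progress-epoll-pwait
     Memcheck:Param
     epoll_pwait(sigmask)
     fun:epoll_pwait
     fun:epoll_dispatch
     fun:opal_libevent2022_event_base_loop
     ...
  }

passed with, e.g., "mpiexec -n 2 valgrind --suppressions=ompi.supp ./a.out".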

Best regards,

  Yvan Fournier


#include <stdio.h>
#include <stdlib.h>

#include <mpi.h>

int main(int argc, char *argv[])
{
  MPI_Request request[2];
  MPI_Status status[2];

  int l = 5, l_prev = 0;
  int rank_next = MPI_PROC_NULL, rank_prev = MPI_PROC_NULL;
  int rank_id = 0, n_ranks = 1, tag = 1;

  MPI_Init(&argc, &argv);

  MPI_Comm_rank(MPI_COMM_WORLD, &rank_id);
  MPI_Comm_size(MPI_COMM_WORLD, &n_ranks);
  if (rank_id > 0)
rank_prev = rank_id -1;
  if (rank_id + 1 < n_ranks)
rank_next = rank_id + 1;

#if defined(VARIANT_1)

  int sendbuf[1] = {l};
  int recvbuf[1] = {0};

  if (rank_id %2 == 0) {
MPI_Isend(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD, &(request[0]));
MPI_Irecv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &(request[1]));
  }
  else {
MPI_Irecv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &(request[0]));
MPI_Isend(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD, &(request[1]));
  }
  MPI_Waitall(2, request, status);

  l_prev = recvbuf[0];

#elif defined(VARIANT_2)

  int *sendbuf = malloc(sizeof(int));
  int *recvbuf = malloc(sizeof(int));

  sendbuf[0] = l;

  if (rank_id %2 == 0) {
MPI_Isend(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD, &(request[0]));
MPI_Irecv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &(request[1]));
  }
  else {
MPI_Irecv(recvbuf, 1, MPI_INT, rank_prev, tag, MPI_COMM_WORLD, &(request[0]));
    /* The archived copy is truncated here; the remainder below mirrors
       VARIANT_1, then frees the buffers and finalizes. */
    MPI_Isend(sendbuf, 1, MPI_INT, rank_next, tag, MPI_COMM_WORLD, &(request[1]));
  }
  MPI_Waitall(2, request, status);

  l_prev = recvbuf[0];

  free(sendbuf);
  free(recvbuf);

#endif

  printf("rank %d received %d from rank %d\n", rank_id, l_prev, rank_prev);

  MPI_Finalize();

  return 0;
}
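
For completeness, both variants can be exercised with the same invocation as
above, selecting the buffer placement at compile time:

  mpicc -g -DVARIANT_1 vg_ompi_isend_irecv.c && mpiexec -n 2 valgrind ./a.out
  mpicc -g -DVARIANT_2 vg_ompi_isend_irecv.c && mpiexec -n 2 valgrind ./a.out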