Although --enable-mem-debug resolves the issue, I get warnings about uninitialized bytes in writev from the opal_if_t structs in opal_ifinit:

==25777== Syscall param writev(vector[...]) points to uninitialised byte(s)
==25777==    at 0x34DE2C9F0C: writev (in /lib64/libc-2.6.so)
==25777==    by 0xAC233B: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==25777==    by 0xABAC92: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==25777==    by 0xAC0A80: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==25777==    by 0xAD025E: orte_rml_oob_send (rml_oob_send.c:137)
==25777==    by 0xAD0CE3: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==25777==    by 0xAA50A6: allgather (grpcomm_bad_module.c:370)
==25777==    by 0xAA592D: modex (grpcomm_bad_module.c:498)
==25777==    by 0x92EE48: ompi_mpi_init (ompi_mpi_init.c:626)
==25777==    by 0x95351C: PMPI_Init (pinit.c:80)

Since this isn't a performance-critical part of Open MPI, why not follow the reasoning already noted in the comment at opal/util/if.c:208 and zero out the struct even when OMPI_ENABLE_MEM_DEBUG is not set?

The attached patch makes this one-line change and clears up all valgrind warnings (when --with-valgrind is enabled).
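
For anyone wondering why a stack struct triggers this at all, here is a minimal standalone sketch of the general effect. It is not Open MPI code: demo_if_t, demo_if_init and demo_if_roundtrip are made-up names, and the fields are only a rough stand-in for opal_if_t. Setting fields one by one leaves the tail of the name array and any compiler padding unwritten, so sending the raw bytes with writev passes undefined bytes to the kernel. Zeroing the whole struct first avoids that.

```c
#include <assert.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical stand-in for opal_if_t; the real struct has different fields. */
typedef struct {
    char  name[16];  /* a short name leaves the rest of this array unwritten */
    int   index;
    short flags;     /* compilers typically add padding after this member */
} demo_if_t;

/* Zero the whole struct before filling it in, so unused array bytes and
 * padding are defined when the struct is later copied or sent as raw bytes. */
static void demo_if_init(demo_if_t *intf, const char *name, int index)
{
    assert(intf != NULL && name != NULL);
    memset(intf, 0, sizeof(*intf));
    strncpy(intf->name, name, sizeof(intf->name) - 1);
    intf->index = index;
    intf->flags = 1;
}

/* Write the raw bytes of *in to a pipe with writev and read them back into
 * *out. Without the memset above, valgrind would be expected to flag this
 * writev call. Returns 0 on success, -1 on failure. */
static int demo_if_roundtrip(const demo_if_t *in, demo_if_t *out)
{
    int fds[2];
    struct iovec iov;
    ssize_t sent;
    ssize_t got = -1;

    if (pipe(fds) != 0) {
        return -1;
    }

    iov.iov_base = (void *) in;   /* writev does not modify the buffer */
    iov.iov_len  = sizeof(*in);

    sent = writev(fds[1], &iov, 1);
    if (sent == (ssize_t) sizeof(*in)) {
        got = read(fds[0], out, sizeof(*out));
    }

    close(fds[0]);
    close(fds[1]);
    return (got == (ssize_t) sizeof(*out)) ? 0 : -1;
}
```

Calling demo_if_init(&a, "eth0", 2) followed by demo_if_roundtrip(&a, &b) from a small main and running it under valgrind should stay quiet. Removing the memset line should bring back the same kind of writev warning as in the trace above.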

Regards,
Simon

diff -r -U 5 openmpi-1.3.2/opal/util/if.c openmpi-1.3.2.edited/opal/util/if.c
--- openmpi-1.3.2/opal/util/if.c	2009-04-16 20:02:42.000000000 +0100
+++ openmpi-1.3.2.edited/opal/util/if.c	2009-04-23 16:18:09.000000000 +0100
@@ -258,11 +258,12 @@
         struct ifreq* ifr = (struct ifreq*) ptr;
         opal_if_t intf;
         opal_if_t *intf_ptr;
         int length;

-        OMPI_DEBUG_ZERO(intf);
+        /* Again, make valgrind and purify happy - this isn't performance critical. */
+        memset(&intf, 0, sizeof(intf));
         OBJ_CONSTRUCT(&intf, opal_list_item_t);

         /* compute offset for entries */
 #ifdef HAVE_STRUCT_SOCKADDR_SA_LEN
         length = sizeof(struct sockaddr);
