Although --enable-mem-debug resolves the issue, without it I still get
valgrind warnings about uninitialized bytes passed to writev, originating
from the opal_if_t structs in opal_ifinit:
==25777== Syscall param writev(vector[...]) points to uninitialised byte(s)
==25777==    at 0x34DE2C9F0C: writev (in /lib64/libc-2.6.so)
==25777==    by 0xAC233B: mca_oob_tcp_msg_send_handler (oob_tcp_msg.c:265)
==25777==    by 0xABAC92: mca_oob_tcp_peer_send (oob_tcp_peer.c:197)
==25777==    by 0xAC0A80: mca_oob_tcp_send_nb (oob_tcp_send.c:167)
==25777==    by 0xAD025E: orte_rml_oob_send (rml_oob_send.c:137)
==25777==    by 0xAD0CE3: orte_rml_oob_send_buffer (rml_oob_send.c:269)
==25777==    by 0xAA50A6: allgather (grpcomm_bad_module.c:370)
==25777==    by 0xAA592D: modex (grpcomm_bad_module.c:498)
==25777==    by 0x92EE48: ompi_mpi_init (ompi_mpi_init.c:626)
==25777==    by 0x95351C: PMPI_Init (pinit.c:80)
Since this isn't a performance-critical part of Open MPI, why not follow
the reasoning already noted in a comment at opal/util/if.c:208 and
zero out the struct even when OMPI_ENABLE_MEM_DEBUG is not defined?
The attached patch makes this one-line change and clears up all valgrind
warnings (when --with-valgrind is enabled).
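
For anyone wondering why the memset matters, here is a minimal sketch of
the effect. It assumes the warnings come from padding or unset fields in
the struct; demo_if_t and demo_if_init are hypothetical stand-ins, not the
real opal_if_t or any Open MPI function. Setting the fields one by one
leaves any padding bytes indeterminate, so zeroing the whole object first
means every byte is defined before it is sent.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for opal_if_t: mixing char and int members
 * typically makes the compiler insert padding bytes. */
typedef struct {
    char name[5];   /* usually followed by padding before `index` */
    int  index;
    char flags;     /* usually followed by trailing padding */
} demo_if_t;

/* Zero the whole object first, then set the fields, as the patch does.
 * After this call every byte of *intf, padding included, is defined. */
static void demo_if_init(demo_if_t *intf, const char *name, int index)
{
    assert(intf != NULL && name != NULL);
    memset(intf, 0, sizeof(*intf));
    /* Copy at most sizeof(name) - 1 bytes; the memset above guarantees
     * the terminating NUL. */
    strncpy(intf->name, name, sizeof(intf->name) - 1);
    intf->index = index;
    intf->flags = 1;
}
```

With the memset in place, two structs initialized with the same arguments
compare equal byte for byte under memcmp, whatever was in that memory
beforehand. Without it, the contents of the padding bytes are unspecified.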
Regards,
Simon
diff -r -U 5 openmpi-1.3.2/opal/util/if.c openmpi-1.3.2.edited/opal/util/if.c
--- openmpi-1.3.2/opal/util/if.c 2009-04-16 20:02:42.000000000 +0100
+++ openmpi-1.3.2.edited/opal/util/if.c 2009-04-23 16:18:09.000000000 +0100
@@ -258,11 +258,12 @@
struct ifreq* ifr = (struct ifreq*) ptr;
opal_if_t intf;
opal_if_t *intf_ptr;
int length;
- OMPI_DEBUG_ZERO(intf);
+ /* Again, make valgrind and purify happy - this isn't performance critical. */
+ memset(&intf, 0, sizeof(intf));
OBJ_CONSTRUCT(&intf, opal_list_item_t);
/* compute offset for entries */
#ifdef HAVE_STRUCT_SOCKADDR_SA_LEN
length = sizeof(struct sockaddr);