Hello Ralph and all

Please ignore this mail. It is indeed due to a syntax error in my code.
Sorry for the noise; I'll be more careful with my homework from now on.

Best regards
Durga

We learn from history that we never learn from history.

On Mon, May 23, 2016 at 2:13 AM, dpchoudh . <dpcho...@gmail.com> wrote:

> Hello Ralph
>
> Thanks for your input. The routine that does the send is this:
>
> static int btl_lf_modex_send(lfgroup lfgroup)
> {
>     char *grp_name = lf_get_group_name(lfgroup, NULL, 0);
>     btl_lf_modex_t lf_modex;
>     int rc;
>     strncpy(lf_modex.grp_name, grp_name, GRP_NAME_MAX_LEN);
>     OPAL_MODEX_SEND(rc, OPAL_PMIX_GLOBAL,
>                     &mca_btl_lf_component.super.btl_version,
>                     (char *)&lf_modex, sizeof(lf_modex));
>     return rc;
> }
>
> This routine is called from the component init routine
> (mca_btl_lf_component_init()). I have verified that the values in the modex
> (lf_modex) are correct.
>
> The receive happens in proc_create, and I call it like this:
>
> OPAL_MODEX_RECV(rc, &mca_btl_lf_component.super.btl_version,
>                 &opal_proc->proc_name,
>                 (uint8_t **)&module_proc->proc_modex, &size);
>
> In here, I get a junk value in proc_modex. If I pass a buffer that was
> malloc()'ed in place of module_proc->proc_modex, I still get bad data.
>
> Thanks again for your help.
>
> Durga
>
> We learn from history that we never learn from history.
>
> On Sat, May 21, 2016 at 8:38 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Please provide the exact code used for both send/recv - you likely have
>> an error in the syntax
>>
>> On May 20, 2016, at 9:36 PM, dpchoudh . <dpcho...@gmail.com> wrote:
>>
>> Hello all
>>
>> I have a naive question:
>>
>> My 'cluster' consists of two nodes, connected back to back with a
>> proprietary link as well as GbE (over a switch).
>> I am calling OPAL_MODEX_SEND() and the modex consists of just this:
>>
>> struct modex
>> {char name[20]; unsigned mtu;};
>>
>> The mtu field is not currently being used. I bzero() the struct and have
>> verified that the value being written to the 'name' field (this is similar
>> to a PKEY for InfiniBand; the driver will translate this to a unique
>> integer) is correct at the sending end.
>>
>> When I do an OPAL_MODEX_RECV(), the value is completely corrupted.
>> However, the size of the modex message is still correct (24 bytes).
>> What could I be doing wrong? (Both nodes are little endian x86_64
>> machines)
>>
>> Thanks in advance
>> Durga
>>
>> We learn from history that we never learn from history.
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/05/19012.php
>>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2016/05/19019.php
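
The thread does not say which line held the syntax error, so the following is not a reconstruction of the fix. It is only a minimal sketch, in plain C, of how a fixed-size payload like the one described (a 20-byte name plus an unsigned mtu) could be filled so that every byte is defined before it is handed to something like OPAL_MODEX_SEND(), and how the received blob could be sanity-checked. The names example_modex_t, example_modex_fill() and example_modex_check() are hypothetical, and GRP_NAME_MAX_LEN is assumed to be 20 to match the struct in the first message. No Open MPI calls are made here.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: 20 matches "char name[20]" in the first message. */
#define GRP_NAME_MAX_LEN 20

/* Hypothetical stand-in for the modex payload described in the thread. */
typedef struct {
    char     grp_name[GRP_NAME_MAX_LEN];
    unsigned mtu;   /* not used yet, as in the original message */
} example_modex_t;

/* Fill the payload so that no byte is left uninitialized. */
static void example_modex_fill(example_modex_t *m, const char *grp_name)
{
    /* Zero everything first: mtu, padding, and the whole name buffer. */
    memset(m, 0, sizeof(*m));
    /* Copy at most LEN-1 bytes so the last byte stays a NUL terminator. */
    strncpy(m->grp_name, grp_name, GRP_NAME_MAX_LEN - 1);
}

/* Receiver-side check: returns 1 if the blob looks like a valid payload. */
static int example_modex_check(const void *blob, size_t size)
{
    example_modex_t m;

    if (blob == NULL || size != sizeof(m)) {
        return 0;
    }
    /* Copy into a local struct to avoid alignment assumptions on blob. */
    memcpy(&m, blob, sizeof(m));
    /* The name must contain a NUL somewhere inside its buffer. */
    return memchr(m.grp_name, '\0', GRP_NAME_MAX_LEN) != NULL;
}
```

A caller would run example_modex_fill() on a local struct before publishing it, and example_modex_check() on the pointer and size that come back from the receive. A size match alone, as seen in the thread, does not show that the contents are intact.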