Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

2022-08-03 Thread Nathan Hjelm via devel
Kind of sounds to me like they are using the wrong proc when receiving. Here is an example of what a modex receive should look like: https://github.com/open-mpi/ompi/blob/main/opal/mca/btl/ugni/btl_ugni_endpoint.c#L44 -Nathan On Aug 3, 2022, at 11:29 AM, "Jeff Squyres (jsquyres) via devel"

Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

2022-08-03 Thread Jeff Squyres (jsquyres) via devel
Glad you solved the first issue! With respect to debugging, if you don't have a parallel debugger, you can do something like this: https://www.open-mpi.org/faq/?category=debugging#serial-debuggers If you haven't done so already, I highly suggest configuring Open MPI with "CFLAGS=-g -O0". As
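For readers without a parallel debugger, the serial-debugger approach mentioned above usually boils down to making each process announce its PID and then spin until a debugger attaches and releases it. The following is only a minimal sketch of that idea, not code from the FAQ or from Open MPI: the helper name `wait_for_attach` and its `max_polls` escape hatch are hypothetical, added here so the loop can also be exercised without a debugger.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of Open MPI): announce this process's PID,
 * then poll until *released becomes nonzero. A debugger attached to the
 * process can set the flag by hand to let execution continue.
 * max_polls == 0 means "wait forever"; any other value bounds the number
 * of polls so the helper can return without a debugger.
 * Returns 0 if released, -1 if the poll limit was reached. */
static int wait_for_attach(volatile int *released,
                           unsigned poll_seconds,
                           unsigned max_polls)
{
    assert(released != NULL);

    printf("PID %d ready for attach\n", (int) getpid());
    fflush(stdout);

    for (unsigned polls = 0; 0 == *released; ++polls) {
        if (max_polls != 0 && polls >= max_polls) {
            return -1;              /* gave up waiting */
        }
        sleep(poll_seconds);        /* stay idle between checks */
    }
    return 0;                       /* flag was set, carry on */
}
```

In practice one would call something like `wait_for_attach(&flag, 5, 0)` early in the code path under investigation, attach a serial debugger to the printed PID, set the flag to a nonzero value from the debugger, and continue. Building with "CFLAGS=-g -O0", as suggested above, keeps the flag and the loop visible to the debugger.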

Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

2022-08-03 Thread Michele Martinelli via devel
Thank you for the answer. Actually, I think I solved that problem some days ago. Basically (if I understand correctly), MPI "adds", in some sense, a header to the data sent (please correct me if I'm wrong), which ob1 then uses to match the arriving data with the MPI_Recv posted by the user.
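The matching idea described above can be illustrated with a small sketch. This is not ob1's actual code or data layout: the type names, field names, and wildcard constants below are all hypothetical, and the sketch assumes only that the header carries a communicator context, a source rank, and a tag, and that a posted receive may use wildcards for source and tag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wildcard values, standing in for MPI_ANY_SOURCE / MPI_ANY_TAG. */
#define SKETCH_ANY_SOURCE (-1)
#define SKETCH_ANY_TAG    (-1)

/* Hypothetical header prepended to each outgoing fragment. */
typedef struct {
    uint16_t ctx;   /* communicator context id */
    int32_t  src;   /* rank of the sender      */
    int32_t  tag;   /* user-supplied tag       */
} sketch_match_hdr_t;

/* Hypothetical description of a receive the user has posted. */
typedef struct {
    uint16_t ctx;
    int32_t  src;   /* may be SKETCH_ANY_SOURCE */
    int32_t  tag;   /* may be SKETCH_ANY_TAG    */
} sketch_posted_recv_t;

/* True if an incoming header satisfies a posted receive. */
static bool sketch_hdr_matches(const sketch_match_hdr_t *hdr,
                               const sketch_posted_recv_t *recv)
{
    assert(hdr != NULL && recv != NULL);
    return hdr->ctx == recv->ctx
        && (recv->src == SKETCH_ANY_SOURCE || recv->src == hdr->src)
        && (recv->tag == SKETCH_ANY_TAG    || recv->tag == hdr->tag);
}

/* Return the index of the first posted receive (in posting order) that
 * matches the header, or -1 if the message is unexpected. */
static int sketch_find_match(const sketch_match_hdr_t *hdr,
                             const sketch_posted_recv_t *posted, int count)
{
    for (int i = 0; i < count; ++i) {
        if (sketch_hdr_matches(hdr, &posted[i])) {
            return i;
        }
    }
    return -1;
}
```

The point for a BTL author is that the header travels as part of the payload the BTL is asked to move, so the bytes must reach the receiving side intact and be handed to the upper layer; if they are dropped or reordered, no posted receive will match.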

Re: [OMPI devel] How to progress MPI_Recv using custom BTL for NIC under development

2022-08-03 Thread Jeff Squyres (jsquyres) via devel
Sorry for the huge delay in replies -- it's summer / vacation season, and I think we (as a community) are a little behind in answering some of these emails. :-( It's been quite a while since I have been in the depths of BTL internals; I'm afraid I don't remember the details offhand. When I

Re: [OMPI devel] Rationale behind memcpy chunk size (in smsc/xpmem)

2022-08-03 Thread Jeff Squyres (jsquyres) via devel
Sorry for the delay in replies -- it's summer / vacation season, and I think we (as a community) are a little behind in answering some of these emails. :-( It's hard to say for any given machine, but a bunch of different hardware factors can come into play, such as: - L1, L2, L3 cache sizes -
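To make the "memcpy chunk size" question concrete, here is a minimal sketch of what copying in fixed-size chunks means. It is not the smsc/xpmem code: the function name `chunked_copy` and its return value are hypothetical, and the chunk size is simply a parameter whose best value would have to be measured on the target machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy len bytes from src to dst in pieces of at most
 * chunk_size bytes each. Returns the number of memcpy calls issued. */
static size_t chunked_copy(void *dst, const void *src,
                           size_t len, size_t chunk_size)
{
    assert(chunk_size > 0);

    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t chunks = 0;

    while (len > 0) {
        /* The last chunk may be shorter than chunk_size. */
        size_t n = len < chunk_size ? len : chunk_size;
        memcpy(d, s, n);
        d += n;
        s += n;
        len -= n;
        ++chunks;
    }
    return chunks;
}
```

Timing a loop like this over a range of `chunk_size` values and message lengths is one way to see how much the choice matters on a particular system.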