Can someone tell me about mca_btl_sm_sendi()? In btl_sm.c, I see that
its slot in the "mca_btl_sm" structure is set to NULL, with the function
name commented out.
It seems to me that pingpong latency over shared memory in OMPI isn't as
low as that of certain "competitors". If I put mca_btl_sm_sendi back in,
pingpong latency improves a little. For one particular pingpong test,
hardware, compiler, etc., I get 953 nsec out of the box and 902 nsec
with mca_btl_sm_sendi restored, i.e., about 5% lower latency.
Why is it commented out? E.g., look at btl_sm.c and search for the
first occurrence of "sendi":
    mca_btl_sm_t mca_btl_sm = {
        {
            &mca_btl_sm_component.super,
            ...
            NULL /*mca_btl_sm_sendi*/, /* send immediate */
            ...
        }
    };
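
To make clear what I think the NULL does, here is a minimal sketch of the pattern in isolation. It is not OMPI code: toy_btl_t, toy_send, toy_sendi, and toy_dispatch are hypothetical names, and it assumes the upper layer checks the "send immediate" slot for NULL and falls back to the regular send path when it is empty.

```c
#include <stddef.h>

/* Hypothetical signature shared by both send paths in this toy model. */
typedef int (*toy_send_fn_t)(const void *buf, size_t len);

/* Hypothetical stand-in for a BTL module; only the two send slots are modeled. */
typedef struct {
    toy_send_fn_t sendi; /* optional "send immediate" fast path; NULL if not provided */
    toy_send_fn_t send;  /* regular send path; assumed always present */
} toy_btl_t;

/* Regular path: returns 1 so a caller can tell which path ran. */
static int toy_send(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 1;
}

/* Fast path: returns 2 so a caller can tell which path ran. */
static int toy_sendi(const void *buf, size_t len)
{
    (void)buf;
    (void)len;
    return 2;
}

/* Use the fast path only when the slot is filled in; otherwise fall back. */
static int toy_dispatch(const toy_btl_t *btl, const void *buf, size_t len)
{
    if (NULL != btl->sendi) {
        return btl->sendi(buf, len);
    }
    return btl->send(buf, len);
}
```

With the slot initialized as { NULL /*toy_sendi*/, toy_send }, toy_dispatch() always takes the regular path; with { toy_sendi, toy_send } it takes the fast path. If the real code behaves this way, the NULL entry would explain the latency difference I measured.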
(I'm just about to leave for a week, but I look forward to reading
everyone's insightful comments and lively discussion upon my return.)