In reviewing some of the PML and BTL MCA params with Gleb, we have
the following questions for the group:
1. btl_*_min_send_size is used to decide when to stop striping a
message across multiple BTLs. Is there a reason that we don't just
use eager_limit for this value? It seems weird to say "this message
is short enough to go across 1 BTL" when it will still take multiple
sends whenever min_send_size > eager_limit. If no one has any objections,
we suggest eliminating this MCA parameter (!!) and its corresponding
value, and just using the BTL's eager limit instead (min_send_size
is set by every BTL, but is used in exactly 1 place in OB1).
Len: please put this on the agenda for next Tuesday (just so that
there's a deadline to ensure progress).
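To make item 1 concrete, here is a rough C sketch of the striping
decision as described above. It is not the OB1 code: struct btl_params,
should_stripe_current(), and should_stripe_proposed() are made-up names,
and the exact comparison that OB1 performs is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a BTL's tunables.
 * This is NOT the real BTL module structure. */
struct btl_params {
    size_t eager_limit;    /* largest message sent eagerly in one send */
    size_t min_send_size;  /* threshold below which striping stops */
};

/* Current behavior (assumed): keep striping across BTLs only while the
 * remaining length is at least min_send_size. */
static bool should_stripe_current(const struct btl_params *btl,
                                  size_t remaining)
{
    assert(btl != NULL);
    return remaining >= btl->min_send_size;
}

/* Proposed behavior (assumed): drop min_send_size and compare against
 * the BTL's eager limit instead. */
static bool should_stripe_proposed(const struct btl_params *btl,
                                   size_t remaining)
{
    assert(btl != NULL);
    return remaining >= btl->eager_limit;
}
```

For example, with eager_limit = 4096 and min_send_size = 32768, a
16384-byte remainder is kept on 1 BTL by the first function even though
it is larger than the eager limit; the second function would stripe it.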
2. rdma_pipeline_offset is a bad name; it is not an accurate
description of what this value represents. See the attached figure
for what this value is: it is the length that is sent/received after
the eager match before the RDMA (it happens to be at the end of the
message, but that's irrelevant). Specifically: it is a length, not
an offset. We should change this name. Here are some suggestions we
came up with:
    rdma_pipeline_send_length (this is our favorite)
    rdma_pipeline_send_recv_length
    rdma_pipeline_send_receive_length
    rdma_pipeline_total_send_length
If no one has any better suggestions, Gleb will change the name to
rdma_pipeline_send_length by COB Thursday.
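
To show why "length" is the right word in item 2, here is a small C
sketch that splits a message into the three pieces described above. It
only illustrates the arithmetic: struct pipeline_split and
split_message() are hypothetical names, the clamping for short messages
is an assumption, and where each piece sits within the message is
deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: how many bytes go through each phase. */
struct pipeline_split {
    size_t eager_len;  /* bytes sent with the eager match */
    size_t send_len;   /* bytes sent/received before the RDMA */
    size_t rdma_len;   /* bytes moved by RDMA */
};

static size_t min_size(size_t a, size_t b)
{
    return a < b ? a : b;
}

/* send_length plays the role of the parameter under discussion: it is
 * a byte count, so it is simply subtracted from what remains after the
 * eager part. No position in the message is involved. */
static struct pipeline_split split_message(size_t msg_len,
                                           size_t eager_limit,
                                           size_t send_length)
{
    struct pipeline_split s;
    size_t rest;

    s.eager_len = min_size(msg_len, eager_limit);
    rest = msg_len - s.eager_len;
    assert(rest <= msg_len);  /* cannot underflow: eager_len <= msg_len */

    s.send_len = min_size(rest, send_length);
    s.rdma_len = rest - s.send_len;
    return s;
}
```

For a 1048576-byte message with eager_limit = 4096 and a send length of
131072, this gives 4096 eager bytes, 131072 send/receive bytes, and
913408 RDMA bytes.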
--
Jeff Squyres
Cisco Systems