Here is my explanation.  A call to MCA_BTL_TCP_FRAG_ALLOC_EAGER or 
MCA_BTL_TCP_FRAG_ALLOC_MAX allocates a chunk of memory that has space for both 
the fragment structure and any payload.  So when we do frag+1, we are setting 
the pointer in the frag to the first byte past the structure, which is where 
the payload of the message lives.  This payload contains the PML header 
information and potentially the user's buffer.  The allocation is therefore 
returning something like 64K for the eager allocation and 128K for the max 
allocation, not just sizeof(mca_btl_tcp_frag_t).  If you look at 
mca_btl_tcp_component_init() in btl_tcp_component.c, you can see where the 
eager and max free lists are initialized.
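
To make the frag+1 idea concrete, here is a minimal sketch of the same 
layout using plain malloc.  It is not the Open MPI code: demo_frag_t, 
demo_frag_alloc() and demo_frag_payload_offset() are hypothetical names, and 
the real mca_btl_tcp_frag_t has more fields and comes from a free list rather 
than malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for mca_btl_tcp_frag_t (the real struct is larger). */
typedef struct demo_frag {
    void  *payload;      /* plays the role of segments[0].seg_addr.pval */
    size_t payload_len;  /* plays the role of segments[0].seg_len */
} demo_frag_t;

/* Allocate the header plus payload_capacity bytes of payload in ONE chunk.
 * Returns NULL if the requested size does not fit or malloc fails. */
static demo_frag_t *demo_frag_alloc(size_t payload_capacity, size_t size)
{
    demo_frag_t *frag;

    if (size > payload_capacity) {
        return NULL;
    }
    frag = malloc(sizeof(*frag) + payload_capacity);
    if (frag == NULL) {
        return NULL;
    }
    /* Pointer arithmetic on demo_frag_t*: frag + 1 is the first byte
     * past the header, which is still inside the chunk we allocated. */
    frag->payload = frag + 1;
    frag->payload_len = size;
    return frag;
}

/* Byte offset of the payload from the start of the chunk. */
static size_t demo_frag_payload_offset(const demo_frag_t *frag)
{
    assert(frag != NULL);
    return (size_t)((const char *)frag->payload - (const char *)frag);
}
```

With this layout, demo_frag_payload_offset() always equals 
sizeof(demo_frag_t), and a single free() releases header and payload together.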

In the case of TCP, there are two segments.  The first segment contains the 
PML header information.  If the buffer being sent (or received) is contiguous, 
the rest of the allocated space is not used.  Instead, the second segment 
points directly at the user's buffer, since there is no need to copy it into 
an intermediate buffer first.  If the buffer being sent (or received) is 
non-contiguous, the data is first copied into the allocated space, because it 
needs to be packed.
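
As a rough illustration of the two cases, the sketch below fills in two 
segments for a contiguous and for a strided (non-contiguous) buffer.  All of 
the names and sizes here (demo_prepare, DEMO_HDR_LEN, DEMO_SPACE, the stride 
parameter) are made up for the example, and it is a simplification: the real 
code uses the convertor to pack data and may describe the packed case 
differently.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_HDR_LEN 16   /* hypothetical PML header size */
#define DEMO_SPACE   256  /* hypothetical space allocated after the fragment */

typedef struct demo_segment {
    void  *addr;
    size_t len;
} demo_segment_t;

typedef struct demo_frag {
    demo_segment_t segments[2];
    unsigned char  space[DEMO_SPACE];  /* stands in for the memory at frag + 1 */
} demo_frag_t;

/* Describe `count` bytes taken every `stride` bytes from `user`.
 * stride == 1 means the user buffer is contiguous.
 * Returns 0 on success, -1 if the packed data would not fit. */
static int demo_prepare(demo_frag_t *frag, unsigned char *user,
                        size_t count, size_t stride)
{
    assert(frag != NULL && user != NULL && stride >= 1);

    /* Segment 0: the header always lives in the allocated space. */
    frag->segments[0].addr = frag->space;
    frag->segments[0].len  = DEMO_HDR_LEN;

    if (stride == 1) {
        /* Contiguous: point straight at the user's buffer, no copy. */
        frag->segments[1].addr = user;
        frag->segments[1].len  = count;
        return 0;
    }

    /* Non-contiguous: pack into the allocated space after the header. */
    if (count > DEMO_SPACE - DEMO_HDR_LEN) {
        return -1;
    }
    unsigned char *packed = frag->space + DEMO_HDR_LEN;
    for (size_t i = 0; i < count; i++) {
        packed[i] = user[i * stride];
    }
    frag->segments[1].addr = packed;
    frag->segments[1].len  = count;
    return 0;
}
```

In the contiguous case the space after the header goes unused; in the strided 
case the second segment ends up pointing into that space instead of at the 
user's memory.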

Does that make sense?


>-----Original Message-----
>From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
>On Behalf Of Alex Margolin
>Sent: Wednesday, April 04, 2012 9:23 AM
>To: Open MPI Developers
>Subject: [OMPI devel] mca_btl_tcp_alloc
>
>Hi,
>
>As I'm working out the bugs in my component I used TCP as reference and
>came across the following:
>In mca_btl_tcp_alloc (openmpi-trunk/ompi/mca/btl/tcp/btl_tcp.c:188) the
>first segment is initialized to point to "frag + 1".
>I don't get it... how/when is this location allocated? Isn't it just after the
>mca_btl_tcp_frag_t structure ends?
>
>Thanks,
>Alex
>
>mca_btl_base_descriptor_t* mca_btl_tcp_alloc(
>     struct mca_btl_base_module_t* btl,
>     struct mca_btl_base_endpoint_t* endpoint,
>     uint8_t order,
>     size_t size,
>     uint32_t flags)
>{
>     mca_btl_tcp_frag_t* frag = NULL;
>     int rc;
>
>     if(size <= btl->btl_eager_limit) {
>         MCA_BTL_TCP_FRAG_ALLOC_EAGER(frag, rc);
>     } else if (size <= btl->btl_max_send_size) {
>         MCA_BTL_TCP_FRAG_ALLOC_MAX(frag, rc);
>     }
>     if( OPAL_UNLIKELY(NULL == frag) ) {
>         return NULL;
>     }
>
>     frag->segments[0].seg_len = size;
>     frag->segments[0].seg_addr.pval = frag+1;
>
>     frag->base.des_src = frag->segments;
>     frag->base.des_src_cnt = 1;
>     frag->base.des_dst = NULL;
>     frag->base.des_dst_cnt = 0;
>     frag->base.des_flags = flags;
>     frag->base.order = MCA_BTL_NO_ORDER;
>     frag->btl = (mca_btl_tcp_module_t*)btl;
>     return (mca_btl_base_descriptor_t*)frag;
>}