I took the liberty of GK ratcheting that CMR through, in the interest of 
expediency...
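
Side note for anyone verifying the fix: when a file exists in SVN but is missing
from the tarball, the usual culprit is the component's Makefile.am lacking an
EXTRA_DIST entry (e.g., something like "EXTRA_DIST = .windows" -- that stanza is
my assumption about this component, not a quote from the tree).  A quick sanity
check against the next nightly tarball would be along these lines; the tarball
filename below is just a placeholder:

$ tar tzf openmpi-1.5-nightly.tar.gz | grep '\.windows$'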

On Dec 14, 2011, at 8:15 AM, Shiqing Fan wrote:

> I see the real problem now: the .windows file is not included in the tarball.
> 
> On 2011-12-14 1:48 PM, George Bosilca wrote:
>> Shiqing,
>> 
>> This file seems to be there.
>> 
>> $ pwd
>> /home/bosilca/unstable/1.5/ompi
>> 
>> $ svn info opal/mca/shmem/windows/.windows
>> Path: opal/mca/shmem/windows/.windows
>> Name: .windows
>> URL: https://svn.open-mpi.org/svn/ompi/branches/v1.5/opal/mca/shmem/windows/.windows
>> Repository Root: https://svn.open-mpi.org/svn/ompi
>> Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
>> Revision: 25637
>> Node Kind: file
>> Schedule: normal
>> Last Changed Author: bosilca
>> Last Changed Rev: 25626
>> Last Changed Date: 2011-12-13 12:20:25 -0500 (Tue, 13 Dec 2011)
>> Text Last Updated: 2011-12-13 12:20:35 -0500 (Tue, 13 Dec 2011)
>> Checksum: ebb6f0135ecdcf7f79d1120046dfb3e6
>> 
>> george.
>> 
>> On Dec 14, 2011, at 05:36 , Shiqing Fan wrote:
>> 
>>> Hi George,
>>> 
>>> A .windows file still seems to be missing from opal/mca/shmem/windows/. Could 
>>> you also svn add it (from the patch in the shmem ticket)?
>>> 
>>> It is not a source file, but rather a configuration file required by CMake. 
>>> This change probably doesn't need another rc. :-) Thanks a lot.
>>> 
>>> 
>>> Regards,
>>> Shiqing
>>> 
>>> On 2011-12-13 10:30 PM, bosi...@osl.iu.edu wrote:
>>>> Author: bosilca
>>>> Date: 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> New Revision: 25627
>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/25627
>>>> 
>>>> Log:
>>>> Add and remove some of the files needed for the shmem patch.
>>>> 
>>>> Added:
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm.c
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm.h
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_rml.c
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_rml.h
>>>> Removed:
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_mmap.c
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_mmap.h
>>>> 
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm.c
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm.c   2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,387 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation.  All rights reserved.
>>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation.  All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2009 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart.  All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2007      Sun Microsystems, Inc.  All rights reserved.
>>>> + * Copyright (c) 2008-2010 Cisco Systems, Inc.  All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#ifdef HAVE_STRING_H
>>>> +#include <string.h>
>>>> +#endif
>>>> +
>>>> +#include "opal/align.h"
>>>> +#include "opal/util/argv.h"
>>>> +#if OPAL_ENABLE_FT_CR == 1
>>>> +#include "opal/runtime/opal_cr.h"
>>>> +#endif
>>>> +
>>>> +#include "orte/util/name_fns.h"
>>>> +#include "orte/util/show_help.h"
>>>> +#include "orte/runtime/orte_globals.h"
>>>> +#include "orte/mca/errmgr/errmgr.h"
>>>> +
>>>> +#include "ompi/constants.h"
>>>> +#include "ompi/mca/dpm/dpm.h"
>>>> +#include "ompi/mca/mpool/sm/mpool_sm.h"
>>>> +
>>>> +#include "common_sm_rml.h"
>>>> +
>>>> +/* ASSUMING local process homogeneity with respect to all utilized shared memory
>>>> + * facilities. that is, if one local process deems a particular shared memory
>>>> + * facility acceptable, then ALL local processes should be able to utilize that
>>>> + * facility. as it stands, this is an important point because one process
>>>> + * dictates to all other local processes which common sm component will be
>>>> + * selected based on its own, local run-time test.
>>>> + */
>>>> +
>>>> +OBJ_CLASS_INSTANCE(
>>>> +    mca_common_sm_module_t,
>>>> +    opal_object_t,
>>>> +    NULL,
>>>> +    NULL
>>>> +);
>>>> +
>>>> +/* list of RML messages that have arrived that have not yet been
>>>> + * consumed by the thread who is looking to complete its component
>>>> + * initialization based on the contents of the RML message.
>>>> + */
>>>> +static opal_list_t pending_rml_msgs;
>>>> +/* flag indicating whether or not pending_rml_msgs has been initialized */
>>>> +static bool pending_rml_msgs_init = false;
>>>> +/* lock to protect multiple instances of mca_common_sm_init() from being
>>>> + * invoked simultaneously (because of RML usage).
>>>> + */
>>>> +static opal_mutex_t mutex;
>>>> +/* shared memory information used for initialization and setup. */
>>>> +static opal_shmem_ds_t shmem_ds;
>>>> +/* number of local processes */
>>>> +static size_t num_local_procs = 0;
>>>> +/* indicates whether or not i'm the lowest named process */
>>>> +static bool lowest_local_proc = false;
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/* static utility functions */
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +static mca_common_sm_module_t *
>>>> +attach_and_init(const char *file_name,
>>>> +                size_t size_ctl_structure,
>>>> +                size_t data_seg_alignment)
>>>> +{
>>>> +    mca_common_sm_module_t *map = NULL;
>>>> +    mca_common_sm_seg_header_t *seg = NULL;
>>>> +    unsigned char *addr = NULL;
>>>> +
>>>> +    /* map the file and initialize segment state */
>>>> +    if (NULL == (seg = (mca_common_sm_seg_header_t *)
>>>> +                       opal_shmem_segment_attach(&shmem_ds))) {
>>>> +        return NULL;
>>>> +    }
>>>> +    opal_atomic_rmb();
>>>> +
>>>> +    /* set up the map object */
>>>> +    if (NULL == (map = OBJ_NEW(mca_common_sm_module_t))) {
>>>> +        ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>>> +        return NULL;
>>>> +    }
>>>> +
>>>> +    /* copy information: from ====> to */
>>>> +    opal_shmem_ds_copy(&shmem_ds, &map->shmem_ds);
>>>> +
>>>> +    /* the first entry in the file is the control structure. the first
>>>> +     * entry in the control structure is an mca_common_sm_seg_header_t
>>>> +     * element
>>>> +     */
>>>> +    map->module_seg = seg;
>>>> +
>>>> +    addr = ((unsigned char *)seg) + size_ctl_structure;
>>>> +    /* if we have a data segment (i.e., if 0 != data_seg_alignment),
>>>> +     * then make it the first aligned address after the control
>>>> +     * structure.  IF THIS HAPPENS, THIS IS A PROGRAMMING ERROR IN
>>>> +     * OPEN MPI!
>>>> +     */
>>>> +    if (0 != data_seg_alignment) {
>>>> +        addr = OPAL_ALIGN_PTR(addr, data_seg_alignment, unsigned char *);
>>>> +        /* is addr past end of the shared memory segment? */
>>>> +        if ((unsigned char *)seg + shmem_ds.seg_size < addr) {
>>>> +            orte_show_help("help-mpi-common-sm.txt", "mmap too small", 1,
>>>> +                           orte_process_info.nodename,
>>>> +                           (unsigned long)shmem_ds.seg_size,
>>>> +                           (unsigned long)size_ctl_structure,
>>>> +                           (unsigned long)data_seg_alignment);
>>>> +            return NULL;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    map->module_data_addr = addr;
>>>> +    map->module_seg_addr = (unsigned char *)seg;
>>>> +
>>>> +    /* map object successfully initialized - we can safely increment
>>>> +     * seg_num_procs_attached_and_inited. this value is used by
>>>> +     * opal_shmem_unlink.
>>>> +     */
>>>> +    (void)opal_atomic_add_size_t(&map->module_seg->seg_num_procs_inited, 1);
>>>> +    opal_atomic_wmb();
>>>> +
>>>> +    return map;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +mca_common_sm_module_t *
>>>> +mca_common_sm_init(ompi_proc_t **procs,
>>>> +                   size_t num_procs,
>>>> +                   size_t size,
>>>> +                   char *file_name,
>>>> +                   size_t size_ctl_structure,
>>>> +                   size_t data_seg_alignment)
>>>> +{
>>>> +    mca_common_sm_module_t *map = NULL;
>>>> +    bool found_lowest = false;
>>>> +    size_t p;
>>>> +    size_t mem_offset;
>>>> +    ompi_proc_t *temp_proc;
>>>> +
>>>> +    num_local_procs = 0;
>>>> +    lowest_local_proc = false;
>>>> +
>>>> +    /* o reorder procs array to have all the local procs at the beginning.
>>>> +     * o look for the local proc with the lowest name.
>>>> +     * o determine the number of local procs.
>>>> +     * o ensure that procs[0] is the lowest named process.
>>>> +     */
>>>> +    for (p = 0; p < num_procs; ++p) {
>>>> +        if (OPAL_PROC_ON_LOCAL_NODE(procs[p]->proc_flags)) {
>>>> +            /* if we don't have a lowest, save the first one */
>>>> +            if (!found_lowest) {
>>>> +                procs[0] = procs[p];
>>>> +                found_lowest = true;
>>>> +            }
>>>> +            else {
>>>> +                /* save this proc */
>>>> +                procs[num_local_procs] = procs[p];
>>>> +                /* if we have a new lowest, swap it with position 0
>>>> +                 * so that procs[0] is always the lowest named proc
>>>> +                 */
>>>> +                if (orte_util_compare_name_fields(ORTE_NS_CMP_ALL,
>>>> +                                                  &(procs[p]->proc_name),
>>>> +                                                  &(procs[0]->proc_name)) < 0) {
>>>> +                    temp_proc = procs[0];
>>>> +                    procs[0] = procs[p];
>>>> +                    procs[num_local_procs] = temp_proc;
>>>> +                }
>>>> +            }
>>>> +            /* regardless of the comparisons above, we found
>>>> +             * another proc on the local node, so increment
>>>> +             */
>>>> +            ++num_local_procs;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    /* if there are fewer than 2 local processes, there's nothing to do. */
>>>> +    if (num_local_procs < 2) {
>>>> +        return NULL;
>>>> +    }
>>>> +
>>>> +    /* determine whether or not i am the lowest local process */
>>>> +    lowest_local_proc = (0 == orte_util_compare_name_fields(
>>>> +                                  ORTE_NS_CMP_ALL,
>>>> +                                  ORTE_PROC_MY_NAME,
>>>> +                                  &(procs[0]->proc_name)));
>>>> +
>>>> +    /* lock here to prevent multiple threads from invoking this
>>>> +     * function simultaneously.  the critical section we're protecting
>>>> +     * is usage of the RML in this block.
>>>> +     */
>>>> +    opal_mutex_lock(&mutex);
>>>> +
>>>> +    if (!pending_rml_msgs_init) {
>>>> +        OBJ_CONSTRUCT(&(pending_rml_msgs), opal_list_t);
>>>> +        pending_rml_msgs_init = true;
>>>> +    }
>>>> +    /* figure out if i am the lowest rank in the group.
>>>> +     * if so, i will create the shared memory backing store
>>>> +     */
>>>> +    if (lowest_local_proc) {
>>>> +        if (OPAL_SUCCESS == opal_shmem_segment_create(&shmem_ds, file_name,
>>>> +                                                      size)) {
>>>> +            map = attach_and_init(file_name, size_ctl_structure,
>>>> +                                  data_seg_alignment);
>>>> +            if (NULL != map) {
>>>> +                mem_offset = map->module_data_addr -
>>>> +                             (unsigned char *)map->module_seg;
>>>> +                map->module_seg->seg_offset = mem_offset;
>>>> +                map->module_seg->seg_size = size - mem_offset;
>>>> +                opal_atomic_init(&map->module_seg->seg_lock,
>>>> +                                 OPAL_ATOMIC_UNLOCKED);
>>>> +                map->module_seg->seg_inited = 0;
>>>> +            }
>>>> +            else {
>>>> +                /* fail!
>>>> +                 * only invalidate the shmem_ds.  doing so will let the rest
>>>> +                 * of the local processes know that the lowest local rank
>>>> +                 * failed to properly initialize the shared memory segment, so
>>>> +                 * they should try to carry on without shared memory support
>>>> +                 */
>>>> +                OPAL_SHMEM_DS_INVALIDATE(&shmem_ds);
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    /* send shmem info to the rest of the local procs. */
>>>> +    if (OMPI_SUCCESS != mca_common_sm_rml_info_bcast(
>>>> +                            &shmem_ds, procs, num_local_procs,
>>>> +                            OMPI_RML_TAG_SM_BACK_FILE_CREATED,
>>>> +                            lowest_local_proc, file_name,
>>>> +                            &(pending_rml_msgs))) {
>>>> +        goto out;
>>>> +    }
>>>> +
>>>> +    /* are we dealing with a valid shmem_ds?  that is, did the lowest
>>>> +     * process successfully initialize the shared memory segment?
>>>> +     */
>>>> +    if (OPAL_SHMEM_DS_IS_VALID(&shmem_ds)) {
>>>> +        if (!lowest_local_proc) {
>>>> +            map = attach_and_init(file_name, size_ctl_structure,
>>>> +                                  data_seg_alignment);
>>>> +        }
>>>> +        else {
>>>> +            /* wait until every other participating process has attached to the
>>>> +             * shared memory segment.
>>>> +             */
>>>> +            while (num_local_procs > map->module_seg->seg_num_procs_inited) {
>>>> +                opal_atomic_rmb();
>>>> +            }
>>>> +            opal_shmem_unlink(&shmem_ds);
>>>> +        }
>>>> +    }
>>>> +
>>>> +out:
>>>> +    opal_mutex_unlock(&mutex);
>>>> +    return map;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/**
>>>> + * this routine is the same as mca_common_sm_init() except that
>>>> + * it takes an (ompi_group_t *) parameter to specify the peers rather
>>>> + * than an array of procs.  unlike mca_common_sm_init(), the
>>>> + * group must contain *only* local peers, or this function will return
>>>> + * NULL and not create any shared memory segment.
>>>> + */
>>>> +mca_common_sm_module_t *
>>>> +mca_common_sm_init_group(ompi_group_t *group,
>>>> +                         size_t size,
>>>> +                         char *file_name,
>>>> +                         size_t size_ctl_structure,
>>>> +                         size_t data_seg_alignment)
>>>> +{
>>>> +    mca_common_sm_module_t *ret = NULL;
>>>> +    ompi_proc_t **procs = NULL;
>>>> +    size_t i;
>>>> +    size_t group_size;
>>>> +    ompi_proc_t *proc;
>>>> +
>>>> +    /* if there are fewer than 2 procs, there's nothing to do */
>>>> +    if ((group_size = ompi_group_size(group)) < 2) {
>>>> +        goto out;
>>>> +    }
>>>> +    else if (NULL == (procs = (ompi_proc_t **)
>>>> +                              malloc(sizeof(ompi_proc_t *) * group_size))) {
>>>> +        ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>>> +        goto out;
>>>> +    }
>>>> +    /* make sure that all the procs in the group are local */
>>>> +    for (i = 0; i < group_size; ++i) {
>>>> +        proc = ompi_group_peer_lookup(group, i);
>>>> +        if (!OPAL_PROC_ON_LOCAL_NODE(proc->proc_flags)) {
>>>> +            goto out;
>>>> +        }
>>>> +        procs[i] = proc;
>>>> +    }
>>>> +    /* let mca_common_sm_init take care of the rest ... */
>>>> +    ret = mca_common_sm_init(procs, group_size, size, file_name,
>>>> +                             size_ctl_structure, data_seg_alignment);
>>>> +out:
>>>> +    if (NULL != procs) {
>>>> +        free(procs);
>>>> +    }
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/**
>>>> + *  allocate memory from a previously allocated shared memory
>>>> + *  block.
>>>> + *
>>>> + *  @param size size of request, in bytes (IN)
>>>> + *
>>>> + *  @retval addr virtual address
>>>> + */
>>>> +void *
>>>> +mca_common_sm_seg_alloc(struct mca_mpool_base_module_t *mpool,
>>>> +                        size_t *size,
>>>> +                        mca_mpool_base_registration_t **registration)
>>>> +{
>>>> +    mca_mpool_sm_module_t *sm_module = (mca_mpool_sm_module_t *)mpool;
>>>> +    mca_common_sm_seg_header_t *seg = sm_module->sm_common_module->module_seg;
>>>> +    void *addr;
>>>> +
>>>> +    opal_atomic_lock(&seg->seg_lock);
>>>> +    if (seg->seg_offset + *size > seg->seg_size) {
>>>> +        addr = NULL;
>>>> +    }
>>>> +    else {
>>>> +        size_t fixup;
>>>> +
>>>> +        /* add base address to segment offset */
>>>> +        addr = sm_module->sm_common_module->module_data_addr + seg->seg_offset;
>>>> +        seg->seg_offset += *size;
>>>> +
>>>> +        /* fix up seg_offset so the next allocation is aligned on a
>>>> +         * sizeof(long) boundary.  Do it here so that we don't have to
>>>> +         * check it before checking the remaining size in the buffer
>>>> +         */
>>>> +        if ((fixup = (seg->seg_offset & (sizeof(long) - 1))) > 0) {
>>>> +            seg->seg_offset += sizeof(long) - fixup;
>>>> +        }
>>>> +    }
>>>> +    if (NULL != registration) {
>>>> +        *registration = NULL;
>>>> +    }
>>>> +    opal_atomic_unlock(&seg->seg_lock);
>>>> +    return addr;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +int
>>>> +mca_common_sm_fini(mca_common_sm_module_t *mca_common_sm_module)
>>>> +{
>>>> +    int rc = OMPI_SUCCESS;
>>>> +
>>>> +    if (NULL != mca_common_sm_module->module_seg) {
>>>> +        if (OPAL_SUCCESS !=
>>>> +            opal_shmem_segment_detach(&mca_common_sm_module->shmem_ds)) {
>>>> +            rc = OMPI_ERROR;
>>>> +        }
>>>> +    }
>>>> +    return rc;
>>>> +}
>>>> +
>>>> 
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm.h
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm.h   2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,163 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation.  All rights reserved.
>>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation.  All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart.  All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2009-2010 Cisco Systems, Inc.  All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#ifndef _COMMON_SM_H_
>>>> +#define _COMMON_SM_H_
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#include "opal/mca/mca.h"
>>>> +#include "opal/class/opal_object.h"
>>>> +#include "opal/class/opal_list.h"
>>>> +#include "opal/sys/atomic.h"
>>>> +#include "opal/mca/shmem/shmem.h"
>>>> +
>>>> +#include "ompi/mca/mpool/mpool.h"
>>>> +#include "ompi/proc/proc.h"
>>>> +#include "ompi/group/group.h"
>>>> +#include "ompi/mca/btl/base/base.h"
>>>> +#include "ompi/mca/btl/base/btl_base_error.h"
>>>> +
>>>> +BEGIN_C_DECLS
>>>> +
>>>> +struct mca_mpool_base_module_t;
>>>> +
>>>> +typedef struct mca_common_sm_seg_header_t {
>>>> +    /* lock to control atomic access */
>>>> +    opal_atomic_lock_t seg_lock;
>>>> +    /* indicates whether or not the segment is ready for use */
>>>> +    volatile int32_t seg_inited;
>>>> +    /* number of local processes that are attached to the shared memory segment.
>>>> +     * this is primarily used as a way of determining whether or not it is safe
>>>> +     * to unlink the shared memory backing store. for example, once seg_att
>>>> +     * is equal to the number of local processes, then we can safely unlink.
>>>> +     */
>>>> +    volatile size_t seg_num_procs_inited;
>>>> +    /* offset to the next memory location available for allocation */
>>>> +    size_t seg_offset;
>>>> +    /* total size of the segment */
>>>> +    size_t seg_size;
>>>> +} mca_common_sm_seg_header_t;
>>>> +
>>>> +typedef struct mca_common_sm_module_t {
>>>> +    /* doubly-linked list element */
>>>> +    opal_list_item_t module_item;
>>>> +    /* pointer to header embedded in the shared memory segment */
>>>> +    mca_common_sm_seg_header_t *module_seg;
>>>> +    /* base address of the segment */
>>>> +    unsigned char *module_seg_addr;
>>>> +    /* base address of data segment */
>>>> +    unsigned char *module_data_addr;
>>>> +    /* shared memory backing facility object that encapsulates shmem info */
>>>> +    opal_shmem_ds_t shmem_ds;
>>>> +} mca_common_sm_module_t;
>>>> +
>>>> +OBJ_CLASS_DECLARATION(mca_common_sm_module_t);
>>>> +
>>>> +/**
>>>> + *  This routine is used to set up a shared memory segment (whether
>>>> + *  it's an mmaped file or a SYSV IPC segment).  It is assumed that
>>>> + *  the shared memory segment does not exist before any of the current
>>>> + *  set of processes try and open it.
>>>> + *
>>>> + *  @param procs - array of (ompi_proc_t *)'s to create this shared
>>>> + *  memory segment for.  This array must be writable; it may be edited
>>>> + *  (in undefined ways) if the array contains procs that are not on
>>>> + *  this host.  It is assumed that the caller will simply free this
>>>> + *  array upon return.  (INOUT)
>>>> + *
>>>> + *  @param num_procs - length of the procs array (IN)
>>>> + *
>>>> + *  @param size - size of the segment, in bytes (IN)
>>>> + *
>>>> + *  @param name - unique string identifier of this segment (IN)
>>>> + *
>>>> + *  @param size_ctl_structure  size of the control structure at
>>>> + *                             the head of the segment. The control structure
>>>> + *                             is assumed to have mca_common_sm_seg_header_t
>>>> + *                             as its first segment (IN)
>>>> + *
>>>> + *  @param data_seg_alignment  alignment of the data segment.  this
>>>> + *                             follows the control structure.  If this
>>>> + *                             value is 0, then assume that there will
>>>> + *                             be no data segment following the control
>>>> + *                             structure. (IN)
>>>> + *
>>>> + *  @return pointer to control structure at head of shared memory segment.
>>>> + */
>>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *
>>>> +mca_common_sm_init(ompi_proc_t **procs,
>>>> +                   size_t num_procs,
>>>> +                   size_t size,
>>>> +                   char *file_name,
>>>> +                   size_t size_ctl_structure,
>>>> +                   size_t data_seg_alignment);
>>>> +
>>>> +/**
>>>> + * This routine is used to set up a shared memory segment (whether
>>>> + * it's an mmaped file or a SYSV IPC segment).  It is assumed that
>>>> + * the shared memory segment does not exist before any of the current
>>>> + * set of processes try and open it.
>>>> + *
>>>> + * This routine is the same as mca_common_sm_init() except that
>>>> + * it takes an (ompi_group_t *) parameter to specify the peers rather
>>>> + * than an array of procs.  Unlike mca_common_sm_init(), the
>>>> + * group must contain *only* local peers, or this function will return
>>>> + * NULL and not create any shared memory segment.
>>>> + */
>>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *
>>>> +mca_common_sm_init_group(ompi_group_t *group,
>>>> +                         size_t size,
>>>> +                         char *file_name,
>>>> +                         size_t size_ctl_structure,
>>>> +                         size_t data_seg_alignment);
>>>> +
>>>> +/**
>>>> + * callback from the sm mpool
>>>> + */
>>>> +OMPI_DECLSPEC extern void *
>>>> +mca_common_sm_seg_alloc(struct mca_mpool_base_module_t *mpool,
>>>> +                        size_t* size,
>>>> +                        mca_mpool_base_registration_t **registration);
>>>> +
>>>> +/**
>>>> + * This function will release all local resources attached to the
>>>> + * shared memory segment. We assume that the operating system will
>>>> + * release the memory resources when the last process releases it.
>>>> + *
>>>> + * @param mca_common_sm_module - instance that is shared between
>>>> + *                               components that use shared memory.
>>>> + *
>>>> + * @return OMPI_SUCCESS if everything was okay, otherwise return OMPI_ERROR.
>>>> + */
>>>> +
>>>> +OMPI_DECLSPEC extern int
>>>> +mca_common_sm_fini(mca_common_sm_module_t *mca_common_sm_module);
>>>> +
>>>> +/**
>>>> + * instance that is shared between components that use shared memory.
>>>> + */
>>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *mca_common_sm_module;
>>>> +
>>>> +END_C_DECLS
>>>> +
>>>> +#endif /* _COMMON_SM_H_ */
>>>> +
>>>> 
>>>> Deleted: branches/v1.5/ompi/mca/common/sm/common_sm_mmap.c
>>>> ==============================================================================
>>>> 
>>>> Deleted: branches/v1.5/ompi/mca/common/sm/common_sm_mmap.h
>>>> ==============================================================================
>>>> 
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm_rml.c
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm_rml.c       2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,154 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation.  All rights reserved.
>>>> + * Copyright (c) 2004-2011 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation.  All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2009 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart.  All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2007      Sun Microsystems, Inc.  All rights reserved.
>>>> + * Copyright (c) 2008-2010 Cisco Systems, Inc.  All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#ifdef HAVE_STRING_H
>>>> +#include <string.h>
>>>> +#endif
>>>> +
>>>> +#include "orte/mca/rml/rml.h"
>>>> +#include "orte/util/name_fns.h"
>>>> +#include "orte/util/show_help.h"
>>>> +#include "orte/runtime/orte_globals.h"
>>>> +#include "orte/mca/errmgr/errmgr.h"
>>>> +
>>>> +#include "ompi/constants.h"
>>>> +#include "ompi/mca/dpm/dpm.h"
>>>> +#include "ompi/mca/common/sm/common_sm_rml.h"
>>>> +
>>>> +OBJ_CLASS_INSTANCE(
>>>> +    mca_common_sm_rml_pending_rml_msg_types_t,
>>>> +    opal_object_t,
>>>> +    NULL,
>>>> +    NULL
>>>> +);
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/**
>>>> + * this routine assumes that sorted_procs is in the following state:
>>>> + *     o all the local procs at the beginning.
>>>> + *     o sorted_procs[0] is the lowest named process.
>>>> + */
>>>> +int
>>>> +mca_common_sm_rml_info_bcast(opal_shmem_ds_t *ds_buf,
>>>> +                             ompi_proc_t **procs,
>>>> +                             size_t num_procs,
>>>> +                             int tag,
>>>> +                             bool bcast_root,
>>>> +                             char *msg_id_str,
>>>> +                             opal_list_t *pending_rml_msgs)
>>>> +{
>>>> +    int rc = OMPI_SUCCESS;
>>>> +    struct iovec iov[MCA_COMMON_SM_RML_MSG_LEN];
>>>> +    int iovrc;
>>>> +    size_t p;
>>>> +    char msg_id_str_to_tx[OPAL_PATH_MAX];
>>>> +
>>>> +    strncpy(msg_id_str_to_tx, msg_id_str, sizeof(msg_id_str_to_tx) - 1);
>>>> +
>>>> +    /* let the first item be the queueing id name */
>>>> +    iov[0].iov_base = (ompi_iov_base_ptr_t)msg_id_str_to_tx;
>>>> +    iov[0].iov_len = sizeof(msg_id_str_to_tx);
>>>> +    iov[1].iov_base = (ompi_iov_base_ptr_t)ds_buf;
>>>> +    iov[1].iov_len = sizeof(opal_shmem_ds_t);
>>>> +
>>>> +    /* figure out if i am the root proc in the group.
>>>> +     * if i am, bcast the message to the rest of the local procs.
>>>> +     */
>>>> +    if (bcast_root) {
>>>> +        opal_progress_event_users_increment();
>>>> +        /* first num_procs items should be local procs */
>>>> +        for (p = 1; p < num_procs; ++p) {
>>>> +            iovrc = orte_rml.send(&(procs[p]->proc_name), iov,
>>>> +                                  MCA_COMMON_SM_RML_MSG_LEN, tag, 0);
>>>> +            if ((ssize_t)(iov[0].iov_len + iov[1].iov_len) > iovrc) {
>>>> +                ORTE_ERROR_LOG(ORTE_ERR_COMM_FAILURE);
>>>> +                opal_progress_event_users_decrement();
>>>> +                rc = OMPI_ERROR;
>>>> +                goto out;
>>>> +            }
>>>> +        }
>>>> +        opal_progress_event_users_decrement();
>>>> +    }
>>>> +    else { /* i am NOT the root ("lowest") proc */
>>>> +        opal_list_item_t *item;
>>>> +        mca_common_sm_rml_pending_rml_msg_types_t *rml_msg;
>>>> +        /* because a component query can be performed simultaneously in multiple
>>>> +         * threads, the RML messages may arrive in any order.  so first check to
>>>> +         * see if we previously received a message for me.
>>>> +         */
>>>> +        for (item = opal_list_get_first(pending_rml_msgs);
>>>> +             opal_list_get_end(pending_rml_msgs) != item;
>>>> +             item = opal_list_get_next(item)) {
>>>> +            rml_msg = (mca_common_sm_rml_pending_rml_msg_types_t *)item;
>>>> +            /* was the message for me? */
>>>> +            if (0 == strcmp(rml_msg->msg_id_str, msg_id_str)) {
>>>> +                opal_list_remove_item(pending_rml_msgs, item);
>>>> +                /*                 from ==============> to */
>>>> +                opal_shmem_ds_copy(&rml_msg->shmem_ds, ds_buf);
>>>> +                OBJ_RELEASE(item);
>>>> +                break;
>>>> +            }
>>>> +        }
>>>> +        /* if we didn't find a message already waiting, block on receiving from
>>>> +         * the RML.
>>>> +         */
>>>> +        if (opal_list_get_end(pending_rml_msgs) == item) {
>>>> +            do {
>>>> +                /* bump up the libevent polling frequency while we're in this
>>>> +                 * RML recv, just to ensure we're checking libevent frequently.
>>>> +                 */
>>>> +                opal_progress_event_users_increment();
>>>> +                iovrc = orte_rml.recv(&(procs[0]->proc_name), iov,
>>>> +                                      MCA_COMMON_SM_RML_MSG_LEN, tag, 0);
>>>> +                opal_progress_event_users_decrement();
>>>> +                if (iovrc < 0) {
>>>> +                    ORTE_ERROR_LOG(ORTE_ERR_RECV_LESS_THAN_POSTED);
>>>> +                    rc = OMPI_ERROR;
>>>> +                    goto out;
>>>> +                }
>>>> +                /* was the message for me?  if so, we're done */
>>>> +                if (0 == strcmp(msg_id_str_to_tx, msg_id_str)) {
>>>> +                    break;
>>>> +                }
>>>> +                /* if not, put it on the pending list and try again */
>>>> +                if (NULL == (rml_msg =
>>>> +                             OBJ_NEW(mca_common_sm_rml_pending_rml_msg_types_t)))
>>>> +                {
>>>> +                    ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>>> +                    rc = OMPI_ERROR;
>>>> +                    goto out;
>>>> +                }
>>>> +                /* not for me, so place on list */
>>>> +                /*                 from ========> to */
>>>> +                opal_shmem_ds_copy(ds_buf, &rml_msg->shmem_ds);
>>>> +                memcpy(rml_msg->msg_id_str, msg_id_str_to_tx, OPAL_PATH_MAX);
>>>> +                opal_list_append(pending_rml_msgs, &(rml_msg->super));
>>>> +            } while (1);
>>>> +        }
>>>> +    }
>>>> +
>>>> +out:
>>>> +    return rc;
>>>> +}
>>>> +
>>>> 
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm_rml.h
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm_rml.h       2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,65 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation.  All rights reserved.
>>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation.  All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart.  All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2009-2010 Cisco Systems, Inc.  All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#ifndef _COMMON_SM_RML_H_
>>>> +#define _COMMON_SM_RML_H_
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#include "opal/mca/mca.h"
>>>> +#include "opal/class/opal_object.h"
>>>> +#include "opal/class/opal_list.h"
>>>> +#include "opal/mca/shmem/base/base.h"
>>>> +#include "opal/mca/shmem/shmem.h"
>>>> +
>>>> +#include "ompi/proc/proc.h"
>>>> +#include "ompi/mca/common/sm/common_sm.h"
>>>> +
>>>> +#define MCA_COMMON_SM_RML_MSG_LEN 2
>>>> +
>>>> +BEGIN_C_DECLS
>>>> +
>>>> +/**
>>>> + * items on the pending_rml_msgs list
>>>> + */
>>>> +typedef struct mca_common_sm_rml_pending_rml_msg_types_t {
>>>> +    opal_list_item_t super;
>>>> +    char msg_id_str[OPAL_PATH_MAX];
>>>> +    opal_shmem_ds_t shmem_ds;
>>>> +} mca_common_sm_rml_pending_rml_msg_types_t;
>>>> +
>>>> +/**
>>>> + * routine used to send common sm initialization information to all local
>>>> + * processes in procs.
>>>> + */
>>>> +OMPI_DECLSPEC extern int
>>>> +mca_common_sm_rml_info_bcast(opal_shmem_ds_t *ds_buf,
>>>> +                             ompi_proc_t **procs,
>>>> +                             size_t num_procs,
>>>> +                             int tag,
>>>> +                             bool bcast_root,
>>>> +                             char *msg_id_str,
>>>> +                             opal_list_t *pending_rml_msgs);
>>>> +
>>>> +END_C_DECLS
>>>> +
>>>> +#endif /* _COMMON_SM_RML_H_ */
>>>> +
>>>> _______________________________________________
>>>> svn-full mailing list
>>>> svn-f...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>>>> 
>>> 
>>> -- 
>>> ---------------------------------------------------------------
>>> Shiqing Fan
>>> High Performance Computing Center Stuttgart (HLRS)
>>> Tel: ++49(0)711-685-87234      Nobelstrasse 19
>>> Fax: ++49(0)711-685-65832      70569 Stuttgart
>>> http://www.hlrs.de/organization/people/shiqing-fan/
>>> email: f...@hlrs.de
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
> 
> 
> -- 
> ---------------------------------------------------------------
> Shiqing Fan
> High Performance Computing Center Stuttgart (HLRS)
> Tel: ++49(0)711-685-87234      Nobelstrasse 19
> Fax: ++49(0)711-685-65832      70569 Stuttgart
> http://www.hlrs.de/organization/people/shiqing-fan/
> email: f...@hlrs.de
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

