I took the liberty of GK ratcheting that CMR through, in the interest of expediency...
On Dec 14, 2011, at 8:15 AM, Shiqing Fan wrote:
> I see the real problem now, the .windows file is not added into the tarball.
>
> On 2011-12-14 1:48 PM, George Bosilca wrote:
>> Shiqing,
>>
>> This file seems to be there.
>>
>> $ pwd
>> /home/bosilca/unstable/1.5/ompi
>>
>> $ svn info opal/mca/shmem/windows/.windows
>> Path: opal/mca/shmem/windows/.windows
>> Name: .windows
>> URL: https://svn.open-mpi.org/svn/ompi/branches/v1.5/opal/mca/shmem/windows/.windows
>> Repository Root: https://svn.open-mpi.org/svn/ompi
>> Repository UUID: 63e3feb5-37d5-0310-a306-e8a459e722fe
>> Revision: 25637
>> Node Kind: file
>> Schedule: normal
>> Last Changed Author: bosilca
>> Last Changed Rev: 25626
>> Last Changed Date: 2011-12-13 12:20:25 -0500 (Tue, 13 Dec 2011)
>> Text Last Updated: 2011-12-13 12:20:35 -0500 (Tue, 13 Dec 2011)
>> Checksum: ebb6f0135ecdcf7f79d1120046dfb3e6
>>
>> george.
>>
>> On Dec 14, 2011, at 05:36 , Shiqing Fan wrote:
>>
>>> Hi George,
>>>
>>> A .windows file seems still missing in opal/mca/shmem/windows/. Could you
>>> also svn add it (from the patch in shmem ticket)?
>>>
>>> It is not a source file, but rather a CMake required configuration file.
>>> Probably this change doesn't need another rc. :-) Thanks a lot.
>>>
>>>
>>> Regards,
>>> Shiqing
>>>
>>> On 2011-12-13 10:30 PM, bosi...@osl.iu.edu wrote:
>>>> Author: bosilca
>>>> Date: 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> New Revision: 25627
>>>> URL: https://svn.open-mpi.org/trac/ompi/changeset/25627
>>>>
>>>> Log:
>>>> Add and remove some of the files needed for the shmem patch.
>>>>
>>>> Added:
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm.c
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm.h
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_rml.c
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_rml.h
>>>> Removed:
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_mmap.c
>>>>    branches/v1.5/ompi/mca/common/sm/common_sm_mmap.h
>>>>
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm.c
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm.c 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,387 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation. All rights reserved.
>>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation. All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2009 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart. All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2007      Sun Microsystems, Inc. All rights reserved.
>>>> + * Copyright (c) 2008-2010 Cisco Systems, Inc. All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#ifdef HAVE_STRING_H
>>>> +#include <string.h>
>>>> +#endif
>>>> +
>>>> +#include "opal/align.h"
>>>> +#include "opal/util/argv.h"
>>>> +#if OPAL_ENABLE_FT_CR == 1
>>>> +#include "opal/runtime/opal_cr.h"
>>>> +#endif
>>>> +
>>>> +#include "orte/util/name_fns.h"
>>>> +#include "orte/util/show_help.h"
>>>> +#include "orte/runtime/orte_globals.h"
>>>> +#include "orte/mca/errmgr/errmgr.h"
>>>> +
>>>> +#include "ompi/constants.h"
>>>> +#include "ompi/mca/dpm/dpm.h"
>>>> +#include "ompi/mca/mpool/sm/mpool_sm.h"
>>>> +
>>>> +#include "common_sm_rml.h"
>>>> +
>>>> +/* ASSUMING local process homogeneity with respect to all utilized shared memory
>>>> + * facilities. that is, if one local process deems a particular shared memory
>>>> + * facility acceptable, then ALL local processes should be able to utilize that
>>>> + * facility. as it stands, this is an important point because one process
>>>> + * dictates to all other local processes which common sm component will be
>>>> + * selected based on its own, local run-time test.
>>>> + */
>>>> +
>>>> +OBJ_CLASS_INSTANCE(
>>>> +    mca_common_sm_module_t,
>>>> +    opal_object_t,
>>>> +    NULL,
>>>> +    NULL
>>>> +);
>>>> +
>>>> +/* list of RML messages that have arrived that have not yet been
>>>> + * consumed by the thread who is looking to complete its component
>>>> + * initialization based on the contents of the RML message.
>>>> + */
>>>> +static opal_list_t pending_rml_msgs;
>>>> +/* flag indicating whether or not pending_rml_msgs has been initialized */
>>>> +static bool pending_rml_msgs_init = false;
>>>> +/* lock to protect multiple instances of mca_common_sm_init() from being
>>>> + * invoked simultaneously (because of RML usage).
>>>> + */
>>>> +static opal_mutex_t mutex;
>>>> +/* shared memory information used for initialization and setup. */
>>>> +static opal_shmem_ds_t shmem_ds;
>>>> +/* number of local processes */
>>>> +static size_t num_local_procs = 0;
>>>> +/* indicates whether or not i'm the lowest named process */
>>>> +static bool lowest_local_proc = false;
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/* static utility functions */
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +static mca_common_sm_module_t *
>>>> +attach_and_init(const char *file_name,
>>>> +                size_t size_ctl_structure,
>>>> +                size_t data_seg_alignment)
>>>> +{
>>>> +    mca_common_sm_module_t *map = NULL;
>>>> +    mca_common_sm_seg_header_t *seg = NULL;
>>>> +    unsigned char *addr = NULL;
>>>> +
>>>> +    /* map the file and initialize segment state */
>>>> +    if (NULL == (seg = (mca_common_sm_seg_header_t *)
>>>> +                       opal_shmem_segment_attach(&shmem_ds))) {
>>>> +        return NULL;
>>>> +    }
>>>> +    opal_atomic_rmb();
>>>> +
>>>> +    /* set up the map object */
>>>> +    if (NULL == (map = OBJ_NEW(mca_common_sm_module_t))) {
>>>> +        ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>>> +        return NULL;
>>>> +    }
>>>> +
>>>> +    /* copy information: from ====> to */
>>>> +    opal_shmem_ds_copy(&shmem_ds, &map->shmem_ds);
>>>> +
>>>> +    /* the first entry in the file is the control structure. the first
>>>> +     * entry in the control structure is an mca_common_sm_seg_header_t
>>>> +     * element
>>>> +     */
>>>> +    map->module_seg = seg;
>>>> +
>>>> +    addr = ((unsigned char *)seg) + size_ctl_structure;
>>>> +    /* if we have a data segment (i.e., if 0 != data_seg_alignment),
>>>> +     * then make it the first aligned address after the control
>>>> +     * structure. IF THIS HAPPENS, THIS IS A PROGRAMMING ERROR IN
>>>> +     * OPEN MPI!
>>>> +     */
>>>> +    if (0 != data_seg_alignment) {
>>>> +        addr = OPAL_ALIGN_PTR(addr, data_seg_alignment, unsigned char *);
>>>> +        /* is addr past end of the shared memory segment? */
>>>> +        if ((unsigned char *)seg + shmem_ds.seg_size < addr) {
>>>> +            orte_show_help("help-mpi-common-sm.txt", "mmap too small", 1,
>>>> +                           orte_process_info.nodename,
>>>> +                           (unsigned long)shmem_ds.seg_size,
>>>> +                           (unsigned long)size_ctl_structure,
>>>> +                           (unsigned long)data_seg_alignment);
>>>> +            return NULL;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    map->module_data_addr = addr;
>>>> +    map->module_seg_addr = (unsigned char *)seg;
>>>> +
>>>> +    /* map object successfully initialized - we can safely increment
>>>> +     * seg_num_procs_attached_and_inited. this value is used by
>>>> +     * opal_shmem_unlink.
>>>> +     */
>>>> +    (void)opal_atomic_add_size_t(&map->module_seg->seg_num_procs_inited, 1);
>>>> +    opal_atomic_wmb();
>>>> +
>>>> +    return map;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +mca_common_sm_module_t *
>>>> +mca_common_sm_init(ompi_proc_t **procs,
>>>> +                   size_t num_procs,
>>>> +                   size_t size,
>>>> +                   char *file_name,
>>>> +                   size_t size_ctl_structure,
>>>> +                   size_t data_seg_alignment)
>>>> +{
>>>> +    mca_common_sm_module_t *map = NULL;
>>>> +    bool found_lowest = false;
>>>> +    size_t p;
>>>> +    size_t mem_offset;
>>>> +    ompi_proc_t *temp_proc;
>>>> +
>>>> +    num_local_procs = 0;
>>>> +    lowest_local_proc = false;
>>>> +
>>>> +    /* o reorder procs array to have all the local procs at the beginning.
>>>> +     * o look for the local proc with the lowest name.
>>>> +     * o determine the number of local procs.
>>>> +     * o ensure that procs[0] is the lowest named process.
>>>> +     */
>>>> +    for (p = 0; p < num_procs; ++p) {
>>>> +        if (OPAL_PROC_ON_LOCAL_NODE(procs[p]->proc_flags)) {
>>>> +            /* if we don't have a lowest, save the first one */
>>>> +            if (!found_lowest) {
>>>> +                procs[0] = procs[p];
>>>> +                found_lowest = true;
>>>> +            }
>>>> +            else {
>>>> +                /* save this proc */
>>>> +                procs[num_local_procs] = procs[p];
>>>> +                /* if we have a new lowest, swap it with position 0
>>>> +                 * so that procs[0] is always the lowest named proc
>>>> +                 */
>>>> +                if (orte_util_compare_name_fields(ORTE_NS_CMP_ALL,
>>>> +                                                  &(procs[p]->proc_name),
>>>> +                                                  &(procs[0]->proc_name)) < 0) {
>>>> +                    temp_proc = procs[0];
>>>> +                    procs[0] = procs[p];
>>>> +                    procs[num_local_procs] = temp_proc;
>>>> +                }
>>>> +            }
>>>> +            /* regardless of the comparisons above, we found
>>>> +             * another proc on the local node, so increment
>>>> +             */
>>>> +            ++num_local_procs;
>>>> +        }
>>>> +    }
>>>> +
>>>> +    /* if there is less than 2 local processes, there's nothing to do. */
>>>> +    if (num_local_procs < 2) {
>>>> +        return NULL;
>>>> +    }
>>>> +
>>>> +    /* determine whether or not i am the lowest local process */
>>>> +    lowest_local_proc = (0 == orte_util_compare_name_fields(
>>>> +                                  ORTE_NS_CMP_ALL,
>>>> +                                  ORTE_PROC_MY_NAME,
>>>> +                                  &(procs[0]->proc_name)));
>>>> +
>>>> +    /* lock here to prevent multiple threads from invoking this
>>>> +     * function simultaneously. the critical section we're protecting
>>>> +     * is usage of the RML in this block.
>>>> +     */
>>>> +    opal_mutex_lock(&mutex);
>>>> +
>>>> +    if (!pending_rml_msgs_init) {
>>>> +        OBJ_CONSTRUCT(&(pending_rml_msgs), opal_list_t);
>>>> +        pending_rml_msgs_init = true;
>>>> +    }
>>>> +    /* figure out if i am the lowest rank in the group.
>>>> +     * if so, i will create the shared memory backing store
>>>> +     */
>>>> +    if (lowest_local_proc) {
>>>> +        if (OPAL_SUCCESS == opal_shmem_segment_create(&shmem_ds, file_name,
>>>> +                                                      size)) {
>>>> +            map = attach_and_init(file_name, size_ctl_structure,
>>>> +                                  data_seg_alignment);
>>>> +            if (NULL != map) {
>>>> +                mem_offset = map->module_data_addr -
>>>> +                             (unsigned char *)map->module_seg;
>>>> +                map->module_seg->seg_offset = mem_offset;
>>>> +                map->module_seg->seg_size = size - mem_offset;
>>>> +                opal_atomic_init(&map->module_seg->seg_lock,
>>>> +                                 OPAL_ATOMIC_UNLOCKED);
>>>> +                map->module_seg->seg_inited = 0;
>>>> +            }
>>>> +            else {
>>>> +                /* fail!
>>>> +                 * only invalidate the shmem_ds. doing so will let the rest
>>>> +                 * of the local processes know that the lowest local rank
>>>> +                 * failed to properly initialize the shared memory segment, so
>>>> +                 * they should try to carry on without shared memory support
>>>> +                 */
>>>> +                OPAL_SHMEM_DS_INVALIDATE(&shmem_ds);
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    /* send shmem info to the rest of the local procs. */
>>>> +    if (OMPI_SUCCESS != mca_common_sm_rml_info_bcast(
>>>> +                            &shmem_ds, procs, num_local_procs,
>>>> +                            OMPI_RML_TAG_SM_BACK_FILE_CREATED,
>>>> +                            lowest_local_proc, file_name,
>>>> +                            &(pending_rml_msgs))) {
>>>> +        goto out;
>>>> +    }
>>>> +
>>>> +    /* are we dealing with a valid shmem_ds? that is, did the lowest
>>>> +     * process successfully initialize the shared memory segment?
>>>> +     */
>>>> +    if (OPAL_SHMEM_DS_IS_VALID(&shmem_ds)) {
>>>> +        if (!lowest_local_proc) {
>>>> +            map = attach_and_init(file_name, size_ctl_structure,
>>>> +                                  data_seg_alignment);
>>>> +        }
>>>> +        else {
>>>> +            /* wait until every other participating process has attached to the
>>>> +             * shared memory segment.
>>>> +             */
>>>> +            while (num_local_procs > map->module_seg->seg_num_procs_inited) {
>>>> +                opal_atomic_rmb();
>>>> +            }
>>>> +            opal_shmem_unlink(&shmem_ds);
>>>> +        }
>>>> +    }
>>>> +
>>>> +out:
>>>> +    opal_mutex_unlock(&mutex);
>>>> +    return map;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/**
>>>> + * this routine is the same as mca_common_sm_mmap_init() except that
>>>> + * it takes an (ompi_group_t *) parameter to specify the peers rather
>>>> + * than an array of procs. unlike mca_common_sm_mmap_init(), the
>>>> + * group must contain *only* local peers, or this function will return
>>>> + * NULL and not create any shared memory segment.
>>>> + */
>>>> +mca_common_sm_module_t *
>>>> +mca_common_sm_init_group(ompi_group_t *group,
>>>> +                         size_t size,
>>>> +                         char *file_name,
>>>> +                         size_t size_ctl_structure,
>>>> +                         size_t data_seg_alignment)
>>>> +{
>>>> +    mca_common_sm_module_t *ret = NULL;
>>>> +    ompi_proc_t **procs = NULL;
>>>> +    size_t i;
>>>> +    size_t group_size;
>>>> +    ompi_proc_t *proc;
>>>> +
>>>> +    /* if there is less than 2 procs, there's nothing to do */
>>>> +    if ((group_size = ompi_group_size(group)) < 2) {
>>>> +        goto out;
>>>> +    }
>>>> +    else if (NULL == (procs = (ompi_proc_t **)
>>>> +                              malloc(sizeof(ompi_proc_t *) * group_size))) {
>>>> +        ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>>> +        goto out;
>>>> +    }
>>>> +    /* make sure that all the procs in the group are local */
>>>> +    for (i = 0; i < group_size; ++i) {
>>>> +        proc = ompi_group_peer_lookup(group, i);
>>>> +        if (!OPAL_PROC_ON_LOCAL_NODE(proc->proc_flags)) {
>>>> +            goto out;
>>>> +        }
>>>> +        procs[i] = proc;
>>>> +    }
>>>> +    /* let mca_common_sm_init take care of the rest ... */
>>>> +    ret = mca_common_sm_init(procs, group_size, size, file_name,
>>>> +                             size_ctl_structure, data_seg_alignment);
>>>> +out:
>>>> +    if (NULL != procs) {
>>>> +        free(procs);
>>>> +    }
>>>> +    return ret;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/**
>>>> + * allocate memory from a previously allocated shared memory
>>>> + * block.
>>>> + *
>>>> + * @param size size of request, in bytes (IN)
>>>> + *
>>>> + * @retval addr virtual address
>>>> + */
>>>> +void *
>>>> +mca_common_sm_seg_alloc(struct mca_mpool_base_module_t *mpool,
>>>> +                        size_t *size,
>>>> +                        mca_mpool_base_registration_t **registration)
>>>> +{
>>>> +    mca_mpool_sm_module_t *sm_module = (mca_mpool_sm_module_t *)mpool;
>>>> +    mca_common_sm_seg_header_t* seg = sm_module->sm_common_module->module_seg;
>>>> +    void *addr;
>>>> +
>>>> +    opal_atomic_lock(&seg->seg_lock);
>>>> +    if (seg->seg_offset + *size > seg->seg_size) {
>>>> +        addr = NULL;
>>>> +    }
>>>> +    else {
>>>> +        size_t fixup;
>>>> +
>>>> +        /* add base address to segment offset */
>>>> +        addr = sm_module->sm_common_module->module_data_addr + seg->seg_offset;
>>>> +        seg->seg_offset += *size;
>>>> +
>>>> +        /* fix up seg_offset so next allocation is aligned on a
>>>> +         * sizeof(long) boundry. Do it here so that we don't have to
>>>> +         * check before checking remaining size in buffer
>>>> +         */
>>>> +        if ((fixup = (seg->seg_offset & (sizeof(long) - 1))) > 0) {
>>>> +            seg->seg_offset += sizeof(long) - fixup;
>>>> +        }
>>>> +    }
>>>> +    if (NULL != registration) {
>>>> +        *registration = NULL;
>>>> +    }
>>>> +    opal_atomic_unlock(&seg->seg_lock);
>>>> +    return addr;
>>>> +}
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +int
>>>> +mca_common_sm_fini(mca_common_sm_module_t *mca_common_sm_module)
>>>> +{
>>>> +    int rc = OMPI_SUCCESS;
>>>> +
>>>> +    if (NULL != mca_common_sm_module->module_seg) {
>>>> +        if (OPAL_SUCCESS !=
>>>> +            opal_shmem_segment_detach(&mca_common_sm_module->shmem_ds)) {
>>>> +            rc = OMPI_ERROR;
>>>> +        }
>>>> +    }
>>>> +    return rc;
>>>> +}
>>>> +
>>>>
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm.h
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm.h 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,163 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation. All rights reserved.
>>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation. All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart. All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2009-2010 Cisco Systems, Inc. All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#ifndef _COMMON_SM_H_
>>>> +#define _COMMON_SM_H_
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#include "opal/mca/mca.h"
>>>> +#include "opal/class/opal_object.h"
>>>> +#include "opal/class/opal_list.h"
>>>> +#include "opal/sys/atomic.h"
>>>> +#include "opal/mca/shmem/shmem.h"
>>>> +
>>>> +#include "ompi/mca/mpool/mpool.h"
>>>> +#include "ompi/proc/proc.h"
>>>> +#include "ompi/group/group.h"
>>>> +#include "ompi/mca/btl/base/base.h"
>>>> +#include "ompi/mca/btl/base/btl_base_error.h"
>>>> +
>>>> +BEGIN_C_DECLS
>>>> +
>>>> +struct mca_mpool_base_module_t;
>>>> +
>>>> +typedef struct mca_common_sm_seg_header_t {
>>>> +    /* lock to control atomic access */
>>>> +    opal_atomic_lock_t seg_lock;
>>>> +    /* indicates whether or not the segment is ready for use */
>>>> +    volatile int32_t seg_inited;
>>>> +    /* number of local processes that are attached to the shared memory segment.
>>>> +     * this is primarily used as a way of determining whether or not it is safe
>>>> +     * to unlink the shared memory backing store. for example, once seg_att
>>>> +     * is equal to the number of local processes, then we can safely unlink.
>>>> +     */
>>>> +    volatile size_t seg_num_procs_inited;
>>>> +    /* offset to next available memory location available for allocation */
>>>> +    size_t seg_offset;
>>>> +    /* total size of the segment */
>>>> +    size_t seg_size;
>>>> +} mca_common_sm_seg_header_t;
>>>> +
>>>> +typedef struct mca_common_sm_module_t {
>>>> +    /* double link list element */
>>>> +    opal_list_item_t module_item;
>>>> +    /* pointer to header embedded in the shared memory segment */
>>>> +    mca_common_sm_seg_header_t *module_seg;
>>>> +    /* base address of the segment */
>>>> +    unsigned char *module_seg_addr;
>>>> +    /* base address of data segment */
>>>> +    unsigned char *module_data_addr;
>>>> +    /* shared memory backing facility object that encapsulates shmem info */
>>>> +    opal_shmem_ds_t shmem_ds;
>>>> +} mca_common_sm_module_t;
>>>> +
>>>> +OBJ_CLASS_DECLARATION(mca_common_sm_module_t);
>>>> +
>>>> +/**
>>>> + * This routine is used to set up a shared memory segment (whether
>>>> + * it's an mmaped file or a SYSV IPC segment). It is assumed that
>>>> + * the shared memory segment does not exist before any of the current
>>>> + * set of processes try and open it.
>>>> + *
>>>> + * @param procs - array of (ompi_proc_t *)'s to create this shared
>>>> + * memory segment for. This array must be writable; it may be edited
>>>> + * (in undefined ways) if the array contains procs that are not on
>>>> + * this host. It is assumed that the caller will simply free this
>>>> + * array upon return. (INOUT)
>>>> + *
>>>> + * @param num_procs - length of the procs array (IN)
>>>> + *
>>>> + * @param size - size of the segment, in bytes (IN)
>>>> + *
>>>> + * @param name - unique string identifier of this segment (IN)
>>>> + *
>>>> + * @param size_ctl_structure size of the control structure at
>>>> + *                           the head of the segment. The control structure
>>>> + *                           is assumed to have mca_common_sm_seg_header_t
>>>> + *                           as its first segment (IN)
>>>> + *
>>>> + * @param data_set_alignment alignment of the data segment. this
>>>> + *                           follows the control structure. If this
>>>> + *                           value if 0, then assume that there will
>>>> + *                           be no data segment following the control
>>>> + *                           structure. (IN)
>>>> + *
>>>> + * @returnvalue pointer to control structure at head of shared memory segment.
>>>> + */
>>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *
>>>> +mca_common_sm_init(ompi_proc_t **procs,
>>>> +                   size_t num_procs,
>>>> +                   size_t size,
>>>> +                   char *file_name,
>>>> +                   size_t size_ctl_structure,
>>>> +                   size_t data_seg_alignment);
>>>> +
>>>> +/**
>>>> + * This routine is used to set up a shared memory segment (whether
>>>> + * it's an mmaped file or a SYSV IPC segment). It is assumed that
>>>> + * the shared memory segment does not exist before any of the current
>>>> + * set of processes try and open it.
>>>> + *
>>>> + * This routine is the same as mca_common_sm_mmap_init() except that
>>>> + * it takes an (ompi_group_t *) parameter to specify the peers rather
>>>> + * than an array of procs. Unlike mca_common_sm_mmap_init(), the
>>>> + * group must contain *only* local peers, or this function will return
>>>> + * NULL and not create any shared memory segment.
>>>> + */
>>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *
>>>> +mca_common_sm_init_group(ompi_group_t *group,
>>>> +                         size_t size,
>>>> +                         char *file_name,
>>>> +                         size_t size_ctl_structure,
>>>> +                         size_t data_seg_alignment);
>>>> +
>>>> +/**
>>>> + * callback from the sm mpool
>>>> + */
>>>> +OMPI_DECLSPEC extern void *
>>>> +mca_common_sm_seg_alloc(struct mca_mpool_base_module_t *mpool,
>>>> +                        size_t* size,
>>>> +                        mca_mpool_base_registration_t **registration);
>>>> +
>>>> +/**
>>>> + * This function will release all local resources attached to the
>>>> + * shared memory segment. We assume that the operating system will
>>>> + * release the memory resources when the last process release it.
>>>> + *
>>>> + * @param mca_common_sm_module - instance that is shared between
>>>> + * components that use shared memory.
>>>> + *
>>>> + * @return OMPI_SUCCESS if everything was okay, otherwise return OMPI_ERROR.
>>>> + */
>>>> +
>>>> +OMPI_DECLSPEC extern int
>>>> +mca_common_sm_fini(mca_common_sm_module_t *mca_common_sm_module);
>>>> +
>>>> +/**
>>>> + * instance that is shared between components that use shared memory.
>>>> + */
>>>> +OMPI_DECLSPEC extern mca_common_sm_module_t *mca_common_sm_module;
>>>> +
>>>> +END_C_DECLS
>>>> +
>>>> +#endif /* _COMMON_SM_H_ */
>>>> +
>>>>
>>>> Deleted: branches/v1.5/ompi/mca/common/sm/common_sm_mmap.c
>>>> ==============================================================================
>>>>
>>>> Deleted: branches/v1.5/ompi/mca/common/sm/common_sm_mmap.h
>>>> ==============================================================================
>>>>
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm_rml.c
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm_rml.c 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,154 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation. All rights reserved.
>>>> + * Copyright (c) 2004-2011 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation. All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2009 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart. All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2007      Sun Microsystems, Inc. All rights reserved.
>>>> + * Copyright (c) 2008-2010 Cisco Systems, Inc. All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#ifdef HAVE_STRING_H
>>>> +#include <string.h>
>>>> +#endif
>>>> +
>>>> +#include "orte/mca/rml/rml.h"
>>>> +#include "orte/util/name_fns.h"
>>>> +#include "orte/util/show_help.h"
>>>> +#include "orte/runtime/orte_globals.h"
>>>> +#include "orte/mca/errmgr/errmgr.h"
>>>> +
>>>> +#include "ompi/constants.h"
>>>> +#include "ompi/mca/dpm/dpm.h"
>>>> +#include "ompi/mca/common/sm/common_sm_rml.h"
>>>> +
>>>> +OBJ_CLASS_INSTANCE(
>>>> +    mca_common_sm_rml_pending_rml_msg_types_t,
>>>> +    opal_object_t,
>>>> +    NULL,
>>>> +    NULL
>>>> +);
>>>> +
>>>> +/* ////////////////////////////////////////////////////////////////////////// */
>>>> +/**
>>>> + * this routine assumes that sorted_procs is in the following state:
>>>> + * o all the local procs at the beginning.
>>>> + * o sorted_procs[0] is the lowest named process.
>>>> + */
>>>> +int
>>>> +mca_common_sm_rml_info_bcast(opal_shmem_ds_t *ds_buf,
>>>> +                             ompi_proc_t **procs,
>>>> +                             size_t num_procs,
>>>> +                             int tag,
>>>> +                             bool bcast_root,
>>>> +                             char *msg_id_str,
>>>> +                             opal_list_t *pending_rml_msgs)
>>>> +{
>>>> +    int rc = OMPI_SUCCESS;
>>>> +    struct iovec iov[MCA_COMMON_SM_RML_MSG_LEN];
>>>> +    int iovrc;
>>>> +    size_t p;
>>>> +    char msg_id_str_to_tx[OPAL_PATH_MAX];
>>>> +
>>>> +    strncpy(msg_id_str_to_tx, msg_id_str, sizeof(msg_id_str_to_tx) - 1);
>>>> +
>>>> +    /* let the first item be the queueing id name */
>>>> +    iov[0].iov_base = (ompi_iov_base_ptr_t)msg_id_str_to_tx;
>>>> +    iov[0].iov_len = sizeof(msg_id_str_to_tx);
>>>> +    iov[1].iov_base = (ompi_iov_base_ptr_t)ds_buf;
>>>> +    iov[1].iov_len = sizeof(opal_shmem_ds_t);
>>>> +
>>>> +    /* figure out if i am the root proc in the group.
>>>> +     * if i am, bcast the message the rest of the local procs.
>>>> +     */
>>>> +    if (bcast_root) {
>>>> +        opal_progress_event_users_increment();
>>>> +        /* first num_procs items should be local procs */
>>>> +        for (p = 1; p < num_procs; ++p) {
>>>> +            iovrc = orte_rml.send(&(procs[p]->proc_name), iov,
>>>> +                                  MCA_COMMON_SM_RML_MSG_LEN, tag, 0);
>>>> +            if ((ssize_t)(iov[0].iov_len + iov[1].iov_len) > iovrc) {
>>>> +                ORTE_ERROR_LOG(ORTE_ERR_COMM_FAILURE);
>>>> +                opal_progress_event_users_decrement();
>>>> +                rc = OMPI_ERROR;
>>>> +                goto out;
>>>> +            }
>>>> +        }
>>>> +        opal_progress_event_users_decrement();
>>>> +    }
>>>> +    else { /* i am NOT the root ("lowest") proc */
>>>> +        opal_list_item_t *item;
>>>> +        mca_common_sm_rml_pending_rml_msg_types_t *rml_msg;
>>>> +        /* because a component query can be performed simultaneously in multiple
>>>> +         * threads, the RML messages may arrive in any order. so first check to
>>>> +         * see if we previously received a message for me.
>>>> +         */
>>>> +        for (item = opal_list_get_first(pending_rml_msgs);
>>>> +             opal_list_get_end(pending_rml_msgs) != item;
>>>> +             item = opal_list_get_next(item)) {
>>>> +            rml_msg = (mca_common_sm_rml_pending_rml_msg_types_t *)item;
>>>> +            /* was the message for me? */
>>>> +            if (0 == strcmp(rml_msg->msg_id_str, msg_id_str)) {
>>>> +                opal_list_remove_item(pending_rml_msgs, item);
>>>> +                /* from ==============> to */
>>>> +                opal_shmem_ds_copy(&rml_msg->shmem_ds, ds_buf);
>>>> +                OBJ_RELEASE(item);
>>>> +                break;
>>>> +            }
>>>> +        }
>>>> +        /* if we didn't find a message already waiting, block on receiving from
>>>> +         * the RML.
>>>> +         */
>>>> +        if (opal_list_get_end(pending_rml_msgs) == item) {
>>>> +            do {
>>>> +                /* bump up the libevent polling frequency while we're in this
>>>> +                 * RML recv, just to ensure we're checking libevent frequently.
>>>> +                 */
>>>> +                opal_progress_event_users_increment();
>>>> +                iovrc = orte_rml.recv(&(procs[0]->proc_name), iov,
>>>> +                                      MCA_COMMON_SM_RML_MSG_LEN, tag, 0);
>>>> +                opal_progress_event_users_decrement();
>>>> +                if (iovrc < 0) {
>>>> +                    ORTE_ERROR_LOG(ORTE_ERR_RECV_LESS_THAN_POSTED);
>>>> +                    rc = OMPI_ERROR;
>>>> +                    goto out;
>>>> +                }
>>>> +                /* was the message for me? if so, we're done */
>>>> +                if (0 == strcmp(msg_id_str_to_tx, msg_id_str)) {
>>>> +                    break;
>>>> +                }
>>>> +                /* if not, put it on the pending list and try again */
>>>> +                if (NULL == (rml_msg =
>>>> +                            OBJ_NEW(mca_common_sm_rml_pending_rml_msg_types_t)))
>>>> +                {
>>>> +                    ORTE_ERROR_LOG(OMPI_ERR_OUT_OF_RESOURCE);
>>>> +                    rc = OMPI_ERROR;
>>>> +                    goto out;
>>>> +                }
>>>> +                /* not for me, so place on list */
>>>> +                /* from ========> to */
>>>> +                opal_shmem_ds_copy(ds_buf, &rml_msg->shmem_ds);
>>>> +                memcpy(rml_msg->msg_id_str, msg_id_str_to_tx, OPAL_PATH_MAX);
>>>> +                opal_list_append(pending_rml_msgs, &(rml_msg->super));
>>>> +            } while(1);
>>>> +        }
>>>> +    }
>>>> +
>>>> +out:
>>>> +    return rc;
>>>> +}
>>>> +
>>>>
>>>> Added: branches/v1.5/ompi/mca/common/sm/common_sm_rml.h
>>>> ==============================================================================
>>>> --- (empty file)
>>>> +++ branches/v1.5/ompi/mca/common/sm/common_sm_rml.h 2011-12-13 16:30:53 EST (Tue, 13 Dec 2011)
>>>> @@ -0,0 +1,65 @@
>>>> +/*
>>>> + * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
>>>> + *                         University Research and Technology
>>>> + *                         Corporation. All rights reserved.
>>>> + * Copyright (c) 2004-2005 The University of Tennessee and The University
>>>> + *                         of Tennessee Research Foundation. All rights
>>>> + *                         reserved.
>>>> + * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
>>>> + *                         University of Stuttgart. All rights reserved.
>>>> + * Copyright (c) 2004-2005 The Regents of the University of California.
>>>> + *                         All rights reserved.
>>>> + * Copyright (c) 2009-2010 Cisco Systems, Inc. All rights reserved.
>>>> + * Copyright (c) 2010-2011 Los Alamos National Security, LLC.
>>>> + *                         All rights reserved.
>>>> + * $COPYRIGHT$
>>>> + *
>>>> + * Additional copyrights may follow
>>>> + *
>>>> + * $HEADER$
>>>> + */
>>>> +
>>>> +#ifndef _COMMON_SM_RML_H_
>>>> +#define _COMMON_SM_RML_H_
>>>> +
>>>> +#include "ompi_config.h"
>>>> +
>>>> +#include "opal/mca/mca.h"
>>>> +#include "opal/class/opal_object.h"
>>>> +#include "opal/class/opal_list.h"
>>>> +#include "opal/mca/shmem/base/base.h"
>>>> +#include "opal/mca/shmem/shmem.h"
>>>> +
>>>> +#include "ompi/proc/proc.h"
>>>> +#include "ompi/mca/common/sm/common_sm.h"
>>>> +
>>>> +#define MCA_COMMON_SM_RML_MSG_LEN 2
>>>> +
>>>> +BEGIN_C_DECLS
>>>> +
>>>> +/**
>>>> + * items on the pending_rml_msgs list
>>>> + */
>>>> +typedef struct mca_common_sm_rml_pending_rml_msg_types_t {
>>>> +    opal_list_item_t super;
>>>> +    char msg_id_str[OPAL_PATH_MAX];
>>>> +    opal_shmem_ds_t shmem_ds;
>>>> +} mca_common_sm_rml_pending_rml_msg_types_t;
>>>> +
>>>> +/**
>>>> + * routine used to send common sm initialization information to all local
>>>> + * processes in procs.
>>>> + */
>>>> +OMPI_DECLSPEC extern int
>>>> +mca_common_sm_rml_info_bcast(opal_shmem_ds_t *ds_buf,
>>>> +                             ompi_proc_t **procs,
>>>> +                             size_t num_procs,
>>>> +                             int tag,
>>>> +                             bool bcast_root,
>>>> +                             char *msg_id_str,
>>>> +                             opal_list_t *pending_rml_msgs);
>>>> +
>>>> +END_C_DECLS
>>>> +
>>>> +#endif /* _COMMON_SM_RML_H_*/
>>>> +
>>>> _______________________________________________
>>>> svn-full mailing list
>>>> svn-f...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
>>>>
>>>
>>> --
>>> ---------------------------------------------------------------
>>> Shiqing Fan
>>> High Performance Computing Center Stuttgart (HLRS)
>>> Tel: ++49(0)711-685-87234 Nobelstrasse 19
>>> Fax: ++49(0)711-685-65832 70569 Stuttgart
>>> http://www.hlrs.de/organization/people/shiqing-fan/
>>> email: f...@hlrs.de
>>>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>
>
>
> --
> ---------------------------------------------------------------
> Shiqing Fan
> High Performance Computing Center Stuttgart (HLRS)
> Tel: ++49(0)711-685-87234 Nobelstrasse 19
> Fax: ++49(0)711-685-65832 70569 Stuttgart
> http://www.hlrs.de/organization/people/shiqing-fan/
> email: f...@hlrs.de
>
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
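
For anyone reading the quoted mca_common_sm_seg_alloc(): the offset fix-up it performs (round the next free offset up to a sizeof(long) boundary after each allocation) can be tried in isolation. The snippet below is only a minimal standalone sketch, not Open MPI code. The names bump_seg_t, round_up() and bump_alloc() are hypothetical, it works on plain offsets instead of a real shared memory segment, it leaves out the opal_atomic_lock()/opal_atomic_unlock() pair, and its bounds check is written to avoid unsigned wrap-around, which differs slightly from the quoted comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the seg_offset / seg_size header fields. */
typedef struct {
    size_t offset;  /* next free byte, relative to the data base address */
    size_t size;    /* total number of bytes in the segment */
} bump_seg_t;

/* Round offset up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t offset, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    size_t rem = offset & (align - 1);
    return rem ? offset + (align - rem) : offset;
}

/* Reserve size bytes and return their starting offset, or (size_t)-1 if the
 * request does not fit. As in the quoted code, the alignment fix-up is applied
 * after the allocation so the next caller starts on a sizeof(long) boundary. */
static size_t bump_alloc(bump_seg_t *seg, size_t size)
{
    if (size > seg->size || seg->offset > seg->size - size) {
        return (size_t)-1;
    }
    size_t start = seg->offset;
    seg->offset = round_up(start + size, sizeof(long));
    return start;
}
```

With a 64-byte segment, a 3-byte request starts at offset 0 and leaves the next free offset at sizeof(long), so the following allocation is already aligned without any check on entry.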