Re: [OMPI devel] GNU Automake 1.14 released
Hi,

On Tuesday 03 September 2013 16:01:30 Ralph Castain wrote:
> I still don't see an issue with just detecting the version of automake being
> used, and setting a conditional that indicates whether or not to
> explicitly include the subdir. Seems like a pretty trivial solution.

Ralph, sorry, we don't understand your proposal. The warnings will be generated at automake time. An AM_CONDITIONAL won't help us here.

> On Sep 3, 2013, at 3:49 PM, "Jeff Squyres (jsquyres)" wrote:
> > On Sep 3, 2013, at 6:45 PM, Fabrício Zimmerer Murta wrote:
> >> I think autotools has a concept of disallowing symlinks as it seems
> >> symlinks can't be done in a portable way, and the goal of autotools is
> >> making projects portable.
> >>
> >> Well, if the autotools user feels like using symlinks, then it must be
> >> expected to break portability wherever you take your autoconfiscated
> >> code to. A choice to the user. Maybe in the case, as the project is
> >> bound to specific compilers, it would not be a problem to lose
> >> portability a bit more by considering symbolic linking around.
> >
> > Fair enough.
> >
> > We've been using sym links in the OMPI project for years in order to
> > compile a series of .c files in 2 different ways. It's portable to all
> > the places that we need/want it.

Jeff, I think you mean the $(LN_S) loops for the PMPI interface. We will look into this.

Thanks.
- Bert

> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Matthias Jurenz
Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany
Phone: +49 (351) 463-31945
Fax: +49 (351) 463-37773
E-Mail: matthias.jur...@tu-dresden.de
[OMPI devel] Inconsistent description of btl_openib_eager_rdma_num parameter in FAQ
Hi,

For quite a long time now, the descriptions of the OpenIB parameters related to eager messages on this FAQ page have been confusingly inconsistent (details below):

http://www.open-mpi.org/faq/?category=openfabrics

Does anybody here have the necessary permissions to fix this?

The answer to question 20 says (last row in the table):

"Each MPI process will use RDMA buffers for eager fragments up to btl_openib_eager_rdma_num MPI peers. Upon receiving the btl_openib_eager_rdma_threshhold'th message from an MPI peer process, if both sides have not yet setup btl_openib_eager_rdma_num sets of eager RDMA buffers, a new set will be created. The set will contain btl_openib_max_eager_rdma buffers; each buffer will be btl_openib_eager_limit bytes (i.e., the maximum size of an eager fragment)."

while part of the answer to question 24 says the following:

* btl_openib_max_eager_rdma (default value: 16): This parameter controls the maximum number of peers that can receive and RDMA connection for short messages. It is not advisable to change this value to a very large number because the polling time increase with the number of the connections; as a direct result, short message latency will increase.
* btl_openib_eager_rdma_num (default value: 16): This parameter controls the maximum number of pre-allocated buffers allocated to each peer for small messages.

In other words, question 20 treats btl_openib_eager_rdma_num as the number of peers and btl_openib_max_eager_rdma as the number of buffers per peer, while question 24 describes the two parameters the other way around.

- Lars

--
Lars Schaefers
Computer Engineering Group of Prof. Dr. Marco Platzner
Paderborn Center for Parallel Computing, University of Paderborn
Pohlweg 47-49, 33098 Paderborn, Germany
Tel: +49 (0)5251 60 4341, Fax: +49 (0)5251 60 5377
Office: Building O 3.119
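To make the practical effect of the two readings concrete, here is a minimal arithmetic sketch in C. It assumes, as the quoted answer to question 20 states, that every eager RDMA buffer is exactly btl_openib_eager_limit bytes; the function name and the 12288-byte limit used in the usage note are illustrative assumptions, not values taken from the FAQ or from the Open MPI source.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Rough upper bound on the eager RDMA buffer memory one MPI process
 * would set aside: (number of peers) x (buffers per peer) x (bytes per
 * buffer). Hypothetical helper; real allocations may add per-buffer
 * headers or alignment padding that this sketch ignores.
 */
size_t eager_rdma_footprint(size_t num_peers,
                            size_t buffers_per_peer,
                            size_t eager_limit_bytes)
{
    assert(num_peers > 0 && buffers_per_peer > 0 && eager_limit_bytes > 0);
    return num_peers * buffers_per_peer * eager_limit_bytes;
}
```

With both parameters at their documented default of 16 and an assumed eager limit of 12288 bytes, `eager_rdma_footprint(16, 16, 12288)` is 3145728 bytes (3 MiB). Because the product is symmetric, the total is the same under either reading; what differs when only one parameter is changed is whether the number of polled peers or the per-peer buffer depth grows, which is why the FAQ wording matters for tuning.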
Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c
Hmm. I don't know how to proceed here. I don't doubt that this is happening to you, but I'm unable to reproduce it. :-\

Can you install a segv handler to simply write(0,...) and sleep() so that you can attach a debugger to a live process when this happens, and poke around a bit? You might get more information from a live process than a corefile.

For example, "remainder" comes from chunk_at_offset(p, nb), so it might be interesting to look at that routine and see if something is going wrong in there...?

On Sep 4, 2013, at 3:15 AM, Christopher Samuel wrote:

> On 04/09/13 04:47, Jeff Squyres (jsquyres) wrote:
>
>> Hmm. Are you building Open MPI in a special way? I ask because I'm
>> unable to replicate the issue -- I've run your test (and a C
>> equivalent) a few hundred times now:
>
> I don't think we do anything unusual, the script we are using is
> fairly simple (it does a module purge to ensure we are just using the
> system compilers and don't pick up anything strange) and is as follows:
>
> #!/bin/bash
>
> BASE=`basename $PWD | sed -e s,-,/,`
>
> module purge
>
> ./configure --prefix=/usr/local/${BASE} --with-slurm --with-openib
> --enable-static --enable-shared
>
> make -j
>
>
> - --
> Christopher Samuel    Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/ http://twitter.com/vlsci

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
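The handler suggested above could look roughly like the following sketch. It is not code from Open MPI; the names format_decimal, segv_wait_handler and install_segv_wait_handler are made up for illustration. It writes to standard error (STDERR_FILENO) rather than descriptor 0, which is an assumption about where the message is most likely to be seen, and it restricts itself to async-signal-safe calls (getpid, write, sleep) because the heap may already be corrupted when the signal arrives.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Render value in decimal into buf (at least 24 bytes, not
 * NUL-terminated) without printf, which is not async-signal-safe.
 * Returns the number of characters written. */
size_t format_decimal(char *buf, unsigned long value)
{
    char tmp[24];
    size_t n = 0, i;

    do {
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    for (i = 0; i < n; i++) {
        buf[i] = tmp[n - 1 - i];   /* digits were produced in reverse */
    }
    return n;
}

/* On SIGSEGV: print the pid, then sleep forever so that a debugger
 * can be attached to the still-living process. */
static void segv_wait_handler(int signo)
{
    static const char prefix[] = "SIGSEGV caught; attach a debugger to pid ";
    char digits[24];
    size_t n = format_decimal(digits, (unsigned long)getpid());

    (void)signo;
    (void)write(STDERR_FILENO, prefix, sizeof(prefix) - 1);
    (void)write(STDERR_FILENO, digits, n);
    (void)write(STDERR_FILENO, "\n", 1);

    for (;;) {
        sleep(60);
    }
}

/* Install the handler; returns 0 on success, -1 on failure. */
int install_segv_wait_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = segv_wait_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_wait_handler() early in main() (or from a small preloaded library) would be enough; once the message appears, something like `gdb -p <pid>` can be used to inspect the state around chunk_at_offset() in the live process.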
[OMPI devel] RFC: Remove alignment code from rcache
WHAT: Remove alignment code from ompi/mca/rcache/vma module

WHY: Because it is redundant and causing problems for memory pools that want different alignment

WHERE: ompi/mca/rcache/vma/rcache_vma.c, ompi/mca/mpool/grdma/mpool_grdma_module.c (Detailed changes attached)

WHEN: Tuesday, September 17, 2013 COB

More detail:

This RFC looks to remove the alignment code from the rcache as it seems unnecessary. In all use cases in the library, alignment requirements are handled in the memory pool layer (or in the case of the vader btl, in the btl layer). It seems more logical that the alignment is in the upper layer as that code is also where any registration restrictions would be known.

The rcache alignment code causes problems for me where I want to have different alignment requirements than the rcache is forcing on me. (The rcache defaults to an alignment of mca_mpool_base_page_size_log=4K on my machine)

Therefore, I would like to make the change as attached to this email. I have run through some tests and all seems OK. Is there anything I am missing such that we need this code in the rcache?
Thanks,
Rolf

[rvandevaart@sm064 ompi-trunk-tuesday]$ svn diff
Index: ompi/mca/rcache/vma/rcache_vma.c
===================================================================
--- ompi/mca/rcache/vma/rcache_vma.c    (revision 29155)
+++ ompi/mca/rcache/vma/rcache_vma.c    (working copy)
@@ -48,15 +48,13 @@
         void* addr, size_t size, mca_mpool_base_registration_t **reg)
 {
     int rc;
-    void* base_addr;
-    void* bound_addr;
+    unsigned char* bound_addr;
 
     if(size == 0) {
         return OMPI_ERROR;
     }
 
-    base_addr = down_align_addr(addr, mca_mpool_base_page_size_log);
-    bound_addr = up_align_addr((void*) ((unsigned long) addr + size - 1), mca_mpool_base_page_size_log);
+    bound_addr = addr + size - 1;
 
     /* Check to ensure that the cache is valid */
     if (OPAL_UNLIKELY(opal_memory_changed() &&
@@ -65,8 +63,8 @@
         return rc;
     }
 
-    *reg = mca_rcache_vma_tree_find((mca_rcache_vma_module_t*)rcache, (unsigned char*)base_addr,
-            (unsigned char*)bound_addr);
+    *reg = mca_rcache_vma_tree_find((mca_rcache_vma_module_t*)rcache, (unsigned char*)addr,
+            bound_addr);
 
     return OMPI_SUCCESS;
 }
@@ -76,14 +74,13 @@
         int reg_cnt)
 {
     int rc;
-    void *base_addr, *bound_addr;
+    unsigned char *bound_addr;
 
     if(size == 0) {
         return OMPI_ERROR;
     }
 
-    base_addr = down_align_addr(addr, mca_mpool_base_page_size_log);
-    bound_addr = up_align_addr((void*) ((unsigned long) addr + size - 1), mca_mpool_base_page_size_log);
+    bound_addr = addr + size - 1;
 
     /* Check to ensure that the cache is valid */
     if (OPAL_UNLIKELY(opal_memory_changed() &&
@@ -93,7 +90,7 @@
     }
 
     return mca_rcache_vma_tree_find_all((mca_rcache_vma_module_t*)rcache,
-            (unsigned char*)base_addr, (unsigned char*)bound_addr, regs,
+            (unsigned char*)addr, bound_addr, regs,
             reg_cnt);
 }
 
Index: ompi/mca/mpool/grdma/mpool_grdma_module.c
===================================================================
--- ompi/mca/mpool/grdma/mpool_grdma_module.c   (revision 29155)
+++ ompi/mca/mpool/grdma/mpool_grdma_module.c   (working copy)
@@ -233,7 +233,7 @@
      * Persistent registration are always registered and placed in the cache */
     if(!(bypass_cache || persist)) {
         /* check to see if memory is registered */
-        mpool->rcache->rcache_find(mpool->rcache, addr, size, reg);
+        mpool->rcache->rcache_find(mpool->rcache, base, bound - base + 1, reg);
         if (*reg && !(flags & MCA_MPOOL_FLAGS_INVALID)) {
             if (0 == (*reg)->ref_count) {
                 /* Leave pinned must be set for this to still be in the rcache. */
@@ -346,7 +346,7 @@
 
     OPAL_THREAD_LOCK(&mpool->rcache->lock);
 
-    rc = mpool->rcache->rcache_find(mpool->rcache, addr, size, reg);
+    rc = mpool->rcache->rcache_find(mpool->rcache, base, bound - base + 1, reg);
     if(NULL != *reg &&
             (mca_mpool_grdma_component.leave_pinned ||
              ((*reg)->flags & MCA_MPOOL_FLAGS_PERSIST) ||
[rvandevaart@sm064 ompi-trunk-tuesday]$
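For readers unfamiliar with the alignment step this RFC removes from the rcache, the following is a small stand-alone sketch of what aligning a lookup range to a power-of-two boundary does. The helper names (sketch_down_align, sketch_up_align, aligned_range) and the struct are hypothetical and are not the Open MPI implementations of down_align_addr() and up_align_addr(); the sketch only assumes the usual mask-based approach, with the alignment given as a log2 shift smaller than the pointer width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clear the low `shift` bits: round down to a 2^shift boundary. */
static uintptr_t sketch_down_align(uintptr_t addr, unsigned shift)
{
    return addr & (~(uintptr_t)0 << shift);
}

/* Set the low `shift` bits: last byte of the enclosing 2^shift block. */
static uintptr_t sketch_up_align(uintptr_t addr, unsigned shift)
{
    return addr | ~(~(uintptr_t)0 << shift);
}

struct range {
    uintptr_t base;   /* first byte covered by the lookup */
    uintptr_t bound;  /* last byte covered by the lookup  */
};

/* Widen [addr, addr + size - 1] to 2^shift boundaries. With shift == 0
 * the range is returned unchanged, which mirrors an rcache that does
 * no alignment of its own and leaves it to the caller. */
struct range aligned_range(uintptr_t addr, size_t size, unsigned shift)
{
    struct range r;

    assert(size > 0);
    r.base  = sketch_down_align(addr, shift);
    r.bound = sketch_up_align(addr + size - 1, shift);
    return r;
}
```

For a 0x100-byte buffer at 0x10234, a 4K alignment (shift 12) widens the lookup to 0x10000..0x10FFF, while a memory pool that wants 64K alignment (shift 16) would need 0x10000..0x1FFFF. If the rcache always applies the 4K widening itself, the pool cannot express a different requirement, which appears to be the conflict described in the RFC.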
[OMPI devel] [PATCH] orte: Do not call tcgetattr on pipe descriptor
The function orte_iof_base_setup_prefork attempts to create a pty for the child's stdout and falls back to a plain pipe if openpty fails. The child uses the 'usepty' flag to decide whether to treat this descriptor as a pty or as a pipe.

Set the 'usepty' flag to 0 upon openpty failure to inform the child that it isn't dealing with a pty even though a pty was requested.

The patch applies against svn trunk and v1.6.5, where I found this issue.

Index: orte/mca/iof/base/iof_base_setup.c
===================================================================
--- orte/mca/iof/base/iof_base_setup.c  (revision 29155)
+++ orte/mca/iof/base/iof_base_setup.c  (working copy)
@@ -94,6 +94,7 @@
 #endif
 
     if (ret < 0) {
+        opts->usepty = 0;
         if (pipe(opts->p_stdout) < 0) {
             ORTE_ERROR_LOG(ORTE_ERR_SYS_LIMITS_PIPES);
             return ORTE_ERR_SYS_LIMITS_PIPES;
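The failure mode and the fix can be modelled in a few lines of stand-alone C. This is only a sketch under stated assumptions: struct io_opts, setup_stdout, child_configure_stdout and always_fail_pty are hypothetical names, the pty opener is passed in as a function pointer so that an openpty failure can be simulated, and the child side is reduced to the single tcgetattr call named in the subject line.

```c
#include <stddef.h>
#include <termios.h>
#include <unistd.h>

struct io_opts {
    int usepty;       /* non-zero: child should treat stdout as a pty */
    int p_stdout[2];  /* [0] parent side, [1] child side              */
};

/* Stand-in for openpty(): returns 0 on success, -1 on failure. */
typedef int (*pty_opener)(int *parent_fd, int *child_fd);

/* Simulates a system where no pty can be allocated. */
int always_fail_pty(int *parent_fd, int *child_fd)
{
    (void)parent_fd;
    (void)child_fd;
    return -1;
}

/* Parent side: try a pty if requested, otherwise fall back to a pipe.
 * Clearing usepty on fallback is the point of the patch above. */
int setup_stdout(struct io_opts *opts, pty_opener open_pty)
{
    int ret = -1;

    if (opts->usepty && open_pty != NULL) {
        ret = open_pty(&opts->p_stdout[0], &opts->p_stdout[1]);
    }
    if (ret < 0) {
        opts->usepty = 0;              /* tell the child it got a pipe */
        if (pipe(opts->p_stdout) < 0) {
            return -1;
        }
    }
    return 0;
}

/* Child side: only touch terminal attributes when a pty is in use.
 * tcgetattr() fails on a pipe because a pipe is not a terminal. */
int child_configure_stdout(const struct io_opts *opts)
{
    struct termios attrs;

    if (!opts->usepty) {
        return 0;
    }
    if (tcgetattr(opts->p_stdout[1], &attrs) < 0) {
        return -1;
    }
    return 0;
}
```

With the flag cleared on fallback, child_configure_stdout() skips the terminal call and succeeds. If the flag were left at 1 while p_stdout holds a pipe, tcgetattr() would fail, which is the situation the patch is meant to avoid.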