Re: [OMPI devel] GNU Automake 1.14 released

2013-09-10 Thread Matthias Jurenz
Hi,

On Tuesday 03 September 2013 16:01:30 Ralph Castain wrote:
> I still don't see an issue with just detecting the version of automake being
> used, and setting a conditional that indicates whether or not to
> explicitly include the subdir. Seems like a pretty trivial solution.

Ralph, sorry, we don't understand your proposal. The warnings will be
generated at automake time. An AM_CONDITIONAL won't help us here.

> On Sep 3, 2013, at 3:49 PM, "Jeff Squyres (jsquyres)" wrote:
> > On Sep 3, 2013, at 6:45 PM, Fabrício Zimmerer Murta wrote:
> >> I think autotools has a concept of disallowing symlinks as it seems
> >> symlinks can't be done in a portable way, and the goal of autotools is
> >> making projects portable.
> >> 
> >> Well, if the autotools user feels like using symlinks, then it must be
> >> expected to break portability wherever you take your autoconfiscated
> >> code to. A choice for the user. Maybe in this case, as the project is
> >> bound to specific compilers, it would not be a problem to lose
> >> portability a bit more by considering symbolic linking around.
> > 
> > Fair enough.
> > 
> > We've been using sym links in the OMPI project for years in order to
> > compile a series of .c files in 2 different ways.  It's portable to all
> > the places that we need/want it.

Jeff, I think you mean the $(LN_S) loops for the PMPI interface. We will have a 
look into this. Thanks.

- Bert

> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
-- 
Matthias Jurenz

Technische Universität Dresden
Center for Information Services and High Performance Computing (ZIH)
01062 Dresden, Germany
Phone: +49 (351) 463-31945
Fax: +49 (351) 463-37773
E-Mail: matthias.jur...@tu-dresden.de

[OMPI devel] Inconsistent description of btl_openib_eager_rdma_num parameter in FAQ

2013-09-10 Thread Lars Schäfers
Hi,

For quite a long time now, there has been a confusing inconsistency in the
descriptions of the OpenIB parameters related to eager messages on this
FAQ page (see details below):
http://www.open-mpi.org/faq/?category=openfabrics

Does somebody here have the necessary permissions to fix this?


The answer to question 20 says (last row in the table): 

"Each MPI process will use RDMA buffers for eager fragments up to
btl_openib_eager_rdma_num MPI peers. Upon receiving the
btl_openib_eager_rdma_threshhold'th message from an MPI peer process, if
both sides have not yet setup btl_openib_eager_rdma_num sets of eager
RDMA buffers, a new set will be created. The set will contain
btl_openib_max_eager_rdma buffers; each buffer will be
btl_openib_eager_limit bytes (i.e., the maximum size of an eager
fragment)."


while part of the answer to question 24 says the following:

  * btl_openib_max_eager_rdma (default value: 16): This parameter
controls the maximum number of peers that can receive and RDMA
connection for short messages. It is not advisable to change this
value to a very large number because the polling time increase with
the number of the connections; as a direct result, short message
latency will increase.

  * btl_openib_eager_rdma_num (default value: 16): This parameter
controls the maximum number of pre-allocated buffers allocated to
each peer for small messages.  


- Lars


-- 
Lars Schaefers
Computer Engineering Group of Prof. Dr. Marco Platzner
Paderborn Center for Parallel Computing, University of Paderborn
Pohlweg 47-49, 33098 Paderborn, Germany
Tel: +49 (0)5251 60 4341, Fax: +49 (0)5251 60 5377
Office: Building O 3.119






Re: [OMPI devel] Possible OMPI 1.6.5 bug? SEGV in malloc.c

2013-09-10 Thread Jeff Squyres (jsquyres)
Hmm.  I don't know how to proceed here.  I don't doubt that this is happening 
to you, but I'm unable to reproduce it.  :-\

Can you install a segv handler to simply write(0,...) and sleep() so that you 
can attach a debugger to a live process when this happens, and poke around a 
bit?  You might get more information from a live process than from a corefile.  For 
example, "remainder" comes from chunk_at_offset(p, nb), so it might be 
interesting to look at that routine and see if something is going wrong in 
there...?
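
A minimal sketch of such a handler, assuming POSIX sigaction() is available. 
The names segv_wait_handler and install_segv_wait_handler are made up for 
illustration, and the message goes to stderr (fd 2) rather than fd 0, which is 
a choice of this sketch:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical handler: announce the fault, then park the process so a
 * debugger can be attached while the faulting state is still live. */
static void segv_wait_handler(int signo)
{
    static const char msg[] =
        "SIGSEGV caught; process parked, attach a debugger to this PID\n";
    ssize_t n;

    (void)signo;
    /* write() and sleep() are both async-signal-safe under POSIX. */
    n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)n;
    for (;;) {
        sleep(1);   /* never return into the faulting instruction */
    }
}

/* Install the handler for SIGSEGV.
 * Returns 0 on success, -1 on failure (errno is set by sigaction). */
int install_segv_wait_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = segv_wait_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Calling install_segv_wait_handler() early in main() (before MPI_Init, for 
example) should be enough; once the message appears, something like 
"gdb -p <pid>" can be used to inspect the parked process.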



On Sep 4, 2013, at 3:15 AM, Christopher Samuel wrote:

> On 04/09/13 04:47, Jeff Squyres (jsquyres) wrote:
> 
>> Hmm.  Are you building Open MPI in a special way?  I ask because I'm
>> unable to replicate the issue -- I've run your test (and a C
>> equivalent) a few hundred times now:
> 
> I don't think we do anything unusual, the script we are using is
> fairly simple (it does a module purge to ensure we are just using the
> system compilers and don't pick up anything strange) and is as follows:
> 
> #!/bin/bash
> 
> BASE=`basename $PWD | sed -e s,-,/,`
> 
> module purge
> 
> ./configure --prefix=/usr/local/${BASE} --with-slurm --with-openib \
>     --enable-static --enable-shared
> 
> make -j
> 
> 
> -- 
> Christopher Samuel    Senior Systems Administrator
> VLSCI - Victorian Life Sciences Computation Initiative
> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
> http://www.vlsci.org.au/  http://twitter.com/vlsci
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI devel] RFC: Remove alignment code from rcache

2013-09-10 Thread Rolf vandeVaart
WHAT: Remove alignment code from ompi/mca/rcache/vma module
WHY: Because it is redundant and causing problems for memory pools that want 
different alignment
WHERE: ompi/mca/rcache/vma/rcache_vma.c, 
ompi/mca/mpool/grdma/mpool_grdma_module.c (Detailed changes attached)
WHEN: Tuesday,  September 17, 2013 COB
More detail:
This RFC looks to remove the alignment code from the rcache as it seems 
unnecessary.  In all use cases in the library, alignment requirements are 
handled in the memory pool layer (or in the case of the vader btl, in the btl 
layer).  It seems more logical that the alignment is in the upper layer as that 
code is also where any registration restrictions would be known.  The rcache 
alignment code causes problems for me because I want different alignment 
requirements than the rcache forces on me.  (The rcache defaults to an 
alignment of mca_mpool_base_page_size_log, which corresponds to 4K pages on my 
machine.)  Therefore, I would like to make the change as attached to this email.

I have run through some tests and all seems OK.  Is there anything I am missing 
such that we need this  code in the rcache?
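
To make the difference concrete, here is a small sketch of the two ways a 
lookup range can be computed.  The helper names (align_down, 
align_up_inclusive, page_widened_range, exact_range) are made up for this 
illustration and are not the library's functions; the sketch assumes that 
up-alignment yields the last byte of the page, i.e. an inclusive bound, and 
that a page-size log of 12 means 4K pages:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr down to the start of its 2^shift-byte block. */
uintptr_t align_down(uintptr_t addr, unsigned shift)
{
    assert(shift < sizeof(uintptr_t) * CHAR_BIT);
    return addr & (~(uintptr_t)0 << shift);
}

/* Round addr up to the last byte of its 2^shift-byte block
 * (assumed semantics: the result is an inclusive bound). */
uintptr_t align_up_inclusive(uintptr_t addr, unsigned shift)
{
    assert(shift < sizeof(uintptr_t) * CHAR_BIT);
    return addr | ~(~(uintptr_t)0 << shift);
}

/* Inclusive [base, bound] byte range used for a cache lookup. */
struct range {
    uintptr_t base;
    uintptr_t bound;
};

/* Range widened to whole pages, as the rcache alignment code does today. */
struct range page_widened_range(uintptr_t addr, size_t size, unsigned shift)
{
    struct range r;
    r.base = align_down(addr, shift);
    r.bound = align_up_inclusive(addr + size - 1, shift);
    return r;
}

/* Range as passed in by the caller, as in the proposed change: any
 * alignment is left to the upper layer (mpool or btl). */
struct range exact_range(uintptr_t addr, size_t size)
{
    struct range r;
    r.base = addr;
    r.bound = addr + size - 1;
    return r;
}
```

For a 16-byte buffer at 0x12345 with shift 12, the widened range is 
[0x12000, 0x12fff] while the exact range is [0x12345, 0x12354]; an upper 
layer that wants a different granularity can pass its own aligned base and 
bound.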

Thanks,
Rolf

[rvandevaart@sm064 ompi-trunk-tuesday]$ svn diff
Index: ompi/mca/rcache/vma/rcache_vma.c
===================================================================
--- ompi/mca/rcache/vma/rcache_vma.c (revision 29155)
+++ ompi/mca/rcache/vma/rcache_vma.c  (working copy)
@@ -48,15 +48,13 @@
 void* addr, size_t size, mca_mpool_base_registration_t **reg)  {
 int rc;
-void* base_addr;
-void* bound_addr;
+unsigned char* bound_addr;
 if(size == 0) {
 return OMPI_ERROR;
 }
-base_addr = down_align_addr(addr, mca_mpool_base_page_size_log);
-bound_addr = up_align_addr((void*) ((unsigned long) addr + size - 1), mca_mpool_base_page_size_log);
+bound_addr = addr + size - 1;

 /* Check to ensure that the cache is valid */
 if (OPAL_UNLIKELY(opal_memory_changed() &&
@@ -65,8 +63,8 @@
 return rc;
 }
-*reg = mca_rcache_vma_tree_find((mca_rcache_vma_module_t*)rcache, (unsigned char*)base_addr,
-(unsigned char*)bound_addr);
+*reg = mca_rcache_vma_tree_find((mca_rcache_vma_module_t*)rcache, (unsigned char*)addr,
+bound_addr);
 return OMPI_SUCCESS;
}
@@ -76,14 +74,13 @@
 int reg_cnt)
{
 int rc;
-void *base_addr, *bound_addr;
+unsigned char *bound_addr;
 if(size == 0) {
 return OMPI_ERROR;
 }
-base_addr = down_align_addr(addr, mca_mpool_base_page_size_log);
-bound_addr = up_align_addr((void*) ((unsigned long) addr + size - 1), mca_mpool_base_page_size_log);
+bound_addr = addr + size - 1;
 /* Check to ensure that the cache is valid */
 if (OPAL_UNLIKELY(opal_memory_changed() &&
@@ -93,7 +90,7 @@
 }
 return mca_rcache_vma_tree_find_all((mca_rcache_vma_module_t*)rcache,
-(unsigned char*)base_addr, (unsigned char*)bound_addr, regs,
+(unsigned char*)addr, bound_addr, regs,
 reg_cnt);
}
Index: ompi/mca/mpool/grdma/mpool_grdma_module.c
===================================================================
--- ompi/mca/mpool/grdma/mpool_grdma_module.c   (revision 29155)
+++ ompi/mca/mpool/grdma/mpool_grdma_module.c(working copy)
@@ -233,7 +233,7 @@
  * Persistent registration are always registered and placed in the cache */
 if(!(bypass_cache || persist)) {
 /* check to see if memory is registered */
-mpool->rcache->rcache_find(mpool->rcache, addr, size, reg);
+mpool->rcache->rcache_find(mpool->rcache, base, bound - base +
+ 1, reg);
 if (*reg && !(flags & MCA_MPOOL_FLAGS_INVALID)) {
 if (0 == (*reg)->ref_count) {
 /* Leave pinned must be set for this to still be in the rcache. */
@@ -346,7 +346,7 @@
 OPAL_THREAD_LOCK(&mpool->rcache->lock);
-rc = mpool->rcache->rcache_find(mpool->rcache, addr, size, reg);
+rc = mpool->rcache->rcache_find(mpool->rcache, base, bound - base +
+ 1, reg);
 if(NULL != *reg &&
 (mca_mpool_grdma_component.leave_pinned ||
  ((*reg)->flags & MCA_MPOOL_FLAGS_PERSIST) ||
[rvandevaart@sm064 ompi-trunk-tuesday]$




[OMPI devel] [PATCH] orte: Do not call tcgetattr on pipe descriptor

2013-09-10 Thread Michał Pecio
The function orte_iof_base_setup_prefork attempts to create a pty for
the child's stdout and falls back to a plain pipe if openpty fails. The
child uses the 'usepty' flag to decide whether to treat this descriptor
as a pty or as a pipe.
Set the 'usepty' flag to 0 upon openpty failure to inform the child that
it isn't dealing with a pty even though a pty has been requested.


Patch applies against svn trunk and v1.6.5, where I found this issue.
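
A self-contained sketch of the fallback logic, for illustration only. The
struct io_opts, the pty_opener callback (standing in for openpty so the
failure path can be exercised) and failing_opener are hypothetical names,
not ORTE's real types or functions:

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the I/O forwarding options. */
struct io_opts {
    int usepty;       /* nonzero: a pty was requested for stdout */
    int p_stdout[2];  /* descriptor pair shared by parent and child */
};

/* A pty opener fills in both descriptors; returns 0 on success, -1 on failure. */
typedef int (*pty_opener)(int *first_fd, int *second_fd);

/* Try a pty if requested, otherwise (or on failure) fall back to a pipe.
 * Returns 0 on success, -1 if no descriptor pair could be created. */
int setup_stdout(struct io_opts *opts, pty_opener open_pty)
{
    int ret = -1;

    assert(opts != NULL);
    if (opts->usepty && open_pty != NULL) {
        ret = open_pty(&opts->p_stdout[0], &opts->p_stdout[1]);
    }
    if (ret < 0) {
        /* Record the fallback so the child does not make terminal-only
         * calls such as tcgetattr() on a pipe descriptor. */
        opts->usepty = 0;
        if (pipe(opts->p_stdout) < 0) {
            return -1;
        }
    }
    return 0;
}

/* Opener that always fails, to exercise the fallback path. */
int failing_opener(int *first_fd, int *second_fd)
{
    (void)first_fd;
    (void)second_fd;
    return -1;
}
```

With failing_opener, setup_stdout() leaves usepty at 0 and p_stdout holding
a plain pipe, which is the state the child needs to see before deciding
whether terminal attributes apply.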


Index: orte/mca/iof/base/iof_base_setup.c
===================================================================
--- orte/mca/iof/base/iof_base_setup.c  (revision 29155)
+++ orte/mca/iof/base/iof_base_setup.c  (working copy)
@@ -94,6 +94,7 @@
 #endif

 if (ret < 0) {
+opts->usepty = 0;
 if (pipe(opts->p_stdout) < 0) {
 ORTE_ERROR_LOG(ORTE_ERR_SYS_LIMITS_PIPES);
 return ORTE_ERR_SYS_LIMITS_PIPES;