Re: [OMPI devel] Using MTT to test the newly added SCTP BTL

2007-11-30 Thread Karol Mroz

Hi, Jeff... thanks for getting back to me.

Jeff Squyres wrote:
> On Nov 29, 2007, at 12:13 PM, Karol Mroz wrote:
> 
>>> One solution might be to remove the .ompi_ignore but to only enable
>>> the SCTP BTL when an explicit --with-sctp flag is given to configure
>>> (or something similar).  You might want to run this by the [OMPI]
>>> group first, but there's precedent for it, so I doubt anyone would
>>> object.
>> The situation at present is that the SCTP BTL only builds on FreeBSD,
>> OSX and Linux and only if the SCTP is found to be in a standard place.
>> On Linux, for instance, you need to have installed the lksctp package
>> in order for the SCTP BTL to build. We also have a --with-sctp configure
>> option where you can specify the SCTP path should it not be in a
>> standard location. If SCTP does not exist on the system, then the BTL
>> will not build and more importantly, will not break the build of the
>> overall system.
> 
> Is this SCTP lksctp package installed by default on any Linux?  OS X?
> Solaris?

The lksctp package is not installed by default on any Linux distribution
that I'm aware of. For OSX, SCTP support is provided via the SCTP
Network Kernel Extension (http://sctp.fh-muenster.de/sctp-nke.html) and
this too is not installed by default. Solaris does have SCTP support by
default, but we currently do not build on Solaris systems regardless.
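
Whether SCTP is usable on a given machine can also be probed at run time. The sketch below is not the Open MPI configure check; it only asks the kernel for a one-to-one style SCTP socket through the standard sockets API. The function name sctp_socket_available is hypothetical, and the probe says nothing about whether the lksctp headers and library needed for a build are present.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the kernel will create an SCTP
 * socket, 0 otherwise. On failure, errno is left as set by socket(),
 * so the caller can report why the probe failed. */
int sctp_socket_available(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_SCTP);
    if (fd < 0) {
        return 0;   /* no SCTP support visible to this process */
    }
    close(fd);      /* the socket was only a probe; release it */
    return 1;
}
```

A caller would typically invoke sctp_socket_available() once at startup and print strerror(errno) when it returns 0.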

>> My question now, is it necessary for us to alter the above
>> behavior (as initially mentioned by Jeff), or is having the SCTP BTL
>> build iff SCTP is found sufficient?
> 
> 
> I think the only thing that matters is what the current default  
> behavior is -- if the .ompi_ignore is removed, will it hose anyone  
> unexpectedly?  I.e., if they build and run today and it works, then  
> the .ompi_ignore is removed and you build and run... and it doesn't  
> work.  That's my only real concern.

Removal of .ompi_ignore should not create build problems for anyone who
is running without some form of SCTP support. To test this claim, we
built Open MPI with .ompi_ignore removed and no SCTP support on both an
Ubuntu Linux and an OSX machine. Both builds succeeded without any problem.

We also had a couple of other questions about SCTP BTL exclusivity, which
refer back to an email from a while back. The relevant message is linked
below, and any advice would be appreciated:
http://www.open-mpi.org/community/lists/devel/2007/11/2609.php

Thanks.

--
Karol Mroz
km...@cs.ubc.ca




[OMPI devel] Another patch for v1.2.5

2007-11-30 Thread Jeff Squyres

Inspired by this thread:

http://www.open-mpi.org/community/lists/users/2007/11/4547.php

Brian kindly donated a patch to make Linux ECONNREFUSED behavior  
better in the oob TCP.  I filed CMR 1192 to get this into 1.2.5.  It's  
not critical for 1.2.5, but it would be nice to have.


--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] Using ompi_proc_t's proc_name.vpid as Universal rank

2007-11-30 Thread Sajjad Tabib
Hi, 

Thanks for the clarification. So, now I am wondering how rank information
regarding processes in MPI_COMM_WORLD is assigned. Is there a table that
stores unique integer values for processes, or is rank assignment done in
some other manner?

Thanks,

Sajjad Tabib




Tim Prins
Sent by: devel-boun...@open-mpi.org
11/30/07 07:22 AM
Please respond to: Open MPI Developers
To: Open MPI Developers
Subject: Re: [OMPI devel] Using ompi_proc_t's proc_name.vpid as Universal rank

Hi Sajjad,

The vpid is not unique. If you do a comm_spawn then the newly launched 
processes will have a new jobid, and their vpids will start at 0. So the 
whole process name is unique.

However, there is talk now of being able to join 2 jobs that were 
started completely independently. This may lead to the point where a 
process name is no longer unique; however, this work appears to be a ways
out, and as far as I know no decisions have been made on it yet.
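
To make the point about uniqueness concrete, here is a minimal sketch. It assumes a process name made of just a jobid and a vpid; the type example_proc_name_t and both helper functions are hypothetical stand-ins, not the actual Open MPI/ORTE definitions. It only illustrates why the vpid alone cannot serve as an identifier after a comm_spawn, while the (jobid, vpid) pair can.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a process name: the pair (jobid, vpid). */
typedef struct {
    uint32_t jobid;  /* differs between the parent job and a spawned job */
    uint32_t vpid;   /* restarts at 0 inside every job */
} example_proc_name_t;

/* Two names refer to the same process only if both fields match. */
bool example_name_equal(example_proc_name_t a, example_proc_name_t b)
{
    return a.jobid == b.jobid && a.vpid == b.vpid;
}

/* Pack the whole name into one 64-bit key, e.g. for a lower layer
 * that wants a single integer identifier per process. */
uint64_t example_name_key(example_proc_name_t n)
{
    return ((uint64_t)n.jobid << 32) | (uint64_t)n.vpid;
}
```

With a parent process named {1, 0} and a spawned process named {2, 0}, the vpids are equal but example_name_equal() reports a mismatch and the two keys differ. The caveat above about joining independently started jobs still applies to any such key.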

Hope this helps,

Tim

Sajjad Tabib wrote:
> 
> Hello,
> 
> I have a proprietary transport/messaging layer that sits below an MTL 
> component. This transport layer requires OpenMPI to assign it a rank 
> that is unique and specific to that process and will not change from 
> execution to termination. In a way, I am trying to find a one-to-one
> correspondence between a process's universal rank in OpenMPI and this
> transport layer. I began looking at ompi_proc_t from different processes
> and seemingly found a unique identifier, proc_name.vpid. Consequently, I
> assigned the ranks to each process in my transport layer based on the
> value of the local vpid of each process.
> I have not tested this thoroughly, but it has been working so far.
> However, I would like to make sure that this is a good approach, or
> know, at least, whether there are other ways to do this. I would
> appreciate it if you could leave me feedback or give suggestions on how 
> to assign universal ranks to a proprietary transport software.
> 
> Thanks for your help,
> 
> Sajjad Tabib
> 
> 
> 
> 
> ___
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] tmp XRC branches

2007-11-30 Thread Gleb Natapov
On Fri, Nov 30, 2007 at 02:06:02PM -0500, Jeff Squyres wrote:
> Are any of the XRC tmp SVN branches still relevant?  Or have they now  
> been integrated into the trunk?
> 
> I ask because I see 4 XRC-related branches out there under /tmp and
> /tmp-public.
They are not relevant any more. I'll remove the one I created.

--
Gleb.


Re: [OMPI devel] Using MTT to test the newly added SCTP BTL

2007-11-30 Thread Jeff Squyres

On Nov 29, 2007, at 12:13 PM, Karol Mroz wrote:

>> One solution might be to remove the .ompi_ignore but to only enable
>> the SCTP BTL when an explicit --with-sctp flag is given to configure
>> (or something similar).  You might want to run this by the [OMPI]
>> group first, but there's precedent for it, so I doubt anyone would
>> object.
>
> The situation at present is that the SCTP BTL only builds on FreeBSD,
> OSX and Linux and only if the SCTP is found to be in a standard place.
> On Linux, for instance, you need to have installed the lksctp package
> in order for the SCTP BTL to build. We also have a --with-sctp configure
> option where you can specify the SCTP path should it not be in a
> standard location. If SCTP does not exist on the system, then the BTL
> will not build and more importantly, will not break the build of the
> overall system.

Is this SCTP lksctp package installed by default on any Linux?  OS X?
Solaris?

> My question now, is it necessary for us to alter the above
> behavior (as initially mentioned by Jeff), or is having the SCTP BTL
> build iff SCTP is found sufficient?

I think the only thing that matters is what the current default
behavior is -- if the .ompi_ignore is removed, will it hose anyone
unexpectedly?  I.e., if they build and run today and it works, then
the .ompi_ignore is removed and you build and run... and it doesn't
work.  That's my only real concern.


--
Jeff Squyres
Cisco Systems


Re: [OMPI devel] Indirect calls to wait* and test*

2007-11-30 Thread Josh Hursey
I would find this a useful feature. I haven't played with the diff so  
I can't comment on it, but the idea of it sounds good to me.


Cheers,
Josh

On Nov 29, 2007, at 6:37 PM, Aurelien Bouteiller wrote:

This patch introduces customisable wait/test for requests as  
discussed at the face-to-face ompi meeting in Paris.


A new global structure (ompi_request_functions) holding all the
pointers to the wait/test functions has been added.
ompi_request_wait* and ompi_request_test* have been #defined to be
replaced by indirect calls through that structure (for example,
ompi_request_functions.req_wait). The default implementations of the
wait/test functions have been renamed from ompi_request_% to
ompi_request_default_%. Those functions are the static initializers of
the ompi_request_functions structure.
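
The layout described above can be sketched roughly as follows. This is not the patch itself: every identifier carries an example_ prefix to mark it as hypothetical, the request type is reduced to a single flag, and only two of the wait/test entry points are shown.

```c
/* Hypothetical request type; the real ompi_request_t is far richer. */
typedef struct { int complete; } example_request_t;

typedef int (*example_wait_fn_t)(example_request_t *req);
typedef int (*example_test_fn_t)(example_request_t *req, int *completed);

/* Table of function pointers, one slot per wait/test entry point. */
typedef struct {
    example_wait_fn_t req_wait;
    example_test_fn_t req_test;
} example_request_fns_t;

/* Default implementations, renamed with a "default" infix. */
int example_request_default_wait(example_request_t *req)
{
    req->complete = 1;           /* stand-in for blocking until completion */
    return 0;
}

int example_request_default_test(example_request_t *req, int *completed)
{
    *completed = req->complete;  /* report the state without blocking */
    return 0;
}

/* Global table, statically initialized with the defaults. */
example_request_fns_t example_request_functions = {
    example_request_default_wait,
    example_request_default_test
};

/* Existing call sites keep their spelling but now call indirectly. */
#define example_request_wait example_request_functions.req_wait
#define example_request_test example_request_functions.req_test
```

A call such as example_request_wait(&req) then expands to example_request_functions.req_wait(&req), which is a load of the function pointer from the table followed by an indirect call.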


To modify the defaults, a component 1) copies the
ompi_request_functions structure (the type ompi_request_fns_t can
be used to declare a suitable variable), and 2) changes some of the
functions according to its needs. This is best done at MPI_Init
time, when there are no threads. Should this component be unloaded, it
has to restore the defaults. The ompi_request_default_* functions
should never be called directly anywhere in the code. If a
component needs to access the previously defined implementation of
wait, it should call its local copy of the function. Component
implementors should keep in mind that another component might have
already changed the defaults and needs to be called.
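
As an illustration of that protocol, here is a minimal sketch of a component that counts waits. All names are hypothetical (example_ prefix) and the table is reduced to a single slot; the open/close hooks are assumed to run while no other threads exist.

```c
/* Hypothetical, reduced versions of the request type and function table. */
typedef struct { int complete; } example_request_t;
typedef int (*example_wait_fn_t)(example_request_t *req);
typedef struct { example_wait_fn_t req_wait; } example_request_fns_t;

int example_request_default_wait(example_request_t *req)
{
    req->complete = 1;  /* stand-in for blocking until completion */
    return 0;
}

example_request_fns_t example_request_functions = {
    example_request_default_wait
};

/* Component-local copy of whatever was installed before this component. */
static example_request_fns_t example_saved_fns;
int example_logged_waits = 0;

/* Replacement wait: do the component's work, then chain to the saved
 * copy rather than calling the default implementation directly, since
 * another component may already have replaced it. */
static int example_logging_wait(example_request_t *req)
{
    example_logged_waits++;
    return example_saved_fns.req_wait(req);
}

void example_component_open(void)
{
    example_saved_fns = example_request_functions;              /* 1) copy   */
    example_request_functions.req_wait = example_logging_wait;  /* 2) change */
}

void example_component_close(void)
{
    example_request_functions = example_saved_fns;  /* restore on unload */
}
```

Chaining through the saved copy is what keeps several such components composable, provided they are closed in the reverse order of opening.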


NetPipe -a (async recv mode) shows no measurable performance
overhead. Here follows the "diff -y" between the original
and modified ompi assembly code from ompi/mpi/c/wait.c. The only
significant difference is an extra movl to load the address of the
ompi_request_functions structure into eax. This obviously explains
why there is no measurable cost on latency.


ORIGINAL                                                            MODIFIED

L2:                                                                 L2:
    movl  L_ompi_request_null$non_lazy_ptr-"L001$pb"(%ebx), %eax        movl  L_ompi_request_null$non_lazy_ptr-"L001$pb"(%ebx), %eax
    cmpl  %eax, (%edi)                                                  cmpl  %eax, (%edi)
    je    L18                                                           je    L18
                                                                  >     movl  L_ompi_request_functions$non_lazy_ptr-"L001$pb"(%ebx), %eax
    movl  %esi, 4(%esp)                                                 movl  %esi, 4(%esp)
    movl  %edi, (%esp)                                                  movl  %edi, (%esp)
    call  L_ompi_request_wait$stub                                |     call  *16(%eax)

Here is the patch for those who want to try themselves.





If I receive comments outlining the need, thread-safe accessors
could be added to allow components to change the functions at
any time during execution and not only during MPI_Init/Finalize.
Please make noise if you find this useful.
If the comments do not suggest extra work, I expect this code to be
committed to the trunk next week.


Aurelien

On Oct 8, 2007, at 06:01, Aurelien Bouteiller wrote:


For message logging purposes, we need to interface with wait_any,
wait_some, test, test_any, test_some, test_all. It is not possible to
use PMPI for this purpose. During the face-to-face meeting in Paris
(5-12 October 2007) we discussed this issue and came to the
conclusion that the best way to achieve this is to replace direct
calls to ompi_request_wait* and test* by indirect calls (the same way as
PML send, recv, etc).

The basic idea is to declare a static structure containing the 8 pointers
to all the functions. This structure is initialized at compilation
time with the current basic wait/test functions. Before the end of
MPI_Init, any component might replace the basics with specialized
functions.

The expected cost is less than .01us of latency according to a preliminary
test. The method is consistent with the way we call pml send/recv.
The mechanism could be used later for stripping grequest out of the
critical path when it is not used.

--
Aurelien Bouteiller, PhD
Innovative Computing Laboratory - MPI group
+1 865 974 6321
1122 Volunteer Boulevard
Claxton Education Building Suite 350
Knoxville, TN 37996



--
Dr. Aurelien Bouteiller, Sr. Research Associate
Innovative Computing Laboratory - MPI group
+1 865 974 6321
1122 Volunteer Boulevard
Claxton Education Building Suite 350
Knoxville, TN 37996
