Re: [OMPI devel] RFC: hide btl segment keys within btl

2012-06-18 Thread Nathan Hjelm
Rolf, I updated my bitbucket based on your and George's comments. Let me know if 
you find any more problems.

-Nathan

On Mon, Jun 18, 2012 at 10:18:20AM -0700, Rolf vandeVaart wrote:
> Hi Nathan:
> I downloaded and tried it out.  There were a few issues that I had to work 
> through, but finally got things working.
> Can you apply this patch to your changes prior to checking things in?
> 
> I also would suggest configuring with --enable-picky as there are something 
> like 10 warnings generated due to your changes.  And check for tabs.
> 
> Otherwise, I think it is good.
> 
> Rolf
> 
> >-----Original Message-----
> >From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
> >On Behalf Of George Bosilca
> >Sent: Saturday, June 16, 2012 12:49 PM
> >To: Open MPI Developers
> >Subject: Re: [OMPI devel] RFC: hide btl segment keys within btl
> >
> >Looks good to me. I would add some checks regarding the number and size of
> >the segments and the allocated space (MCA_BTL_SEG_MAX_SIZE) to make
> >sure we never hit the corner case where there are too many segments
> >compared with the available space. And add a huge comment in the btl.h
> >about the fact that mca_btl_base_segment_t should be used with extreme
> >care.
> >
> >  george.
> >
> >On Jun 14, 2012, at 18:42 , Jeff Squyres wrote:
> >
> >> This sounds like a good thing to me.  +1
> >>
> >> On Jun 13, 2012, at 12:58 PM, Nathan Hjelm wrote:
> >>
> >>> What: hide btl segment keys from PML/OSC code.
> >>>
> >>> Why: As it stands new BTLs with larger segment keys (smcuda for example)
> >require changes in both OSC/rdma as well as the PMLs. This RFC will
> >make changes in segment keys transparent to all btl users.
> >>>
> >>> When: The changes are very straight-forward so I am setting the timeout
> >for this to June 22, 2012
> >>>
> >>> Where: See the attached patch or check out the bitbucket
> >http://bitbucket.org/hjelmn/ompi-btl-interface-update
> >>>
> >>> All the relevant PMLs/BTLs + OSC/rdma have been updated with the
> >exception of btl/wv. I have also tested the following components:
> >>> - ob1
> >>> - csum
> >>> - bfo
> >>> - ugni (now works with MPI one-sided)
> >>> - sm
> >>> - vader
> >>> - openib (in progress)
> >>>
> >>> Brian and Rolf, please take a look at your components and let me know if I
> >screwed anything up.
> >>>
> >>> -Nathan Hjelm
> >>> HPC-3, LANL
> >>> ___
> >>> devel mailing list
> >>> de...@open-mpi.org
> >>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
> >>
> >>
> >> --
> >> Jeff Squyres
> >> jsquy...@cisco.com
> >> For corporate legal information go to:
> >http://www.cisco.com/web/about/doing_business/legal/cri/
> >>
> >>



Re: [OMPI devel] RFC: hide btl segment keys within btl

2012-06-18 Thread Rolf vandeVaart
Hi Nathan:
I downloaded and tried it out.  There were a few issues that I had to work 
through, but finally got things working.
Can you apply this patch to your changes prior to checking things in?

I also would suggest configuring with --enable-picky as there are something 
like 10 warnings generated due to your changes.  And check for tabs.

Otherwise, I think it is good.

Rolf

>-----Original Message-----
>From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org]
>On Behalf Of George Bosilca
>Sent: Saturday, June 16, 2012 12:49 PM
>To: Open MPI Developers
>Subject: Re: [OMPI devel] RFC: hide btl segment keys within btl
>
>Looks good to me. I would add some checks regarding the number and size of
>the segments and the allocated space (MCA_BTL_SEG_MAX_SIZE) to make
>sure we never hit the corner case where there are too many segments
>compared with the available space. And add a huge comment in the btl.h
>about the fact that mca_btl_base_segment_t should be used with extreme
>care.
>
>  george.
>
>On Jun 14, 2012, at 18:42 , Jeff Squyres wrote:
>
>> This sounds like a good thing to me.  +1
>>
>> On Jun 13, 2012, at 12:58 PM, Nathan Hjelm wrote:
>>
>>> What: hide btl segment keys from PML/OSC code.
>>>
>>> Why: As it stands new BTLs with larger segment keys (smcuda for example)
>require changes in both OSC/rdma as well as the PMLs. This RFC will
>make changes in segment keys transparent to all btl users.
>>>
>>> When: The changes are very straight-forward so I am setting the timeout
>for this to June 22, 2012
>>>
>>> Where: See the attached patch or check out the bitbucket
>http://bitbucket.org/hjelmn/ompi-btl-interface-update
>>>
>>> All the relevant PMLs/BTLs + OSC/rdma have been updated with the
>exception of btl/wv. I have also tested the following components:
>>> - ob1
>>> - csum
>>> - bfo
>>> - ugni (now works with MPI one-sided)
>>> - sm
>>> - vader
>>> - openib (in progress)
>>>
>>> Brian and Rolf, please take a look at your components and let me know if I
>screwed anything up.
>>>
>>> -Nathan Hjelm
>>> HPC-3, LANL
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>


cuda-fixes.diff
Description: cuda-fixes.diff
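
George's suggestion quoted above, to check the number and size of the
segments against the allocated space (MCA_BTL_SEG_MAX_SIZE), could look
roughly like the sketch below. This is only an illustration under stated
assumptions: the sketch_* names, the struct layouts, and the value 256 are
hypothetical and are not taken from btl.h or from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MCA_BTL_SEG_MAX_SIZE; the real value is not
 * given in the thread, 256 is an arbitrary choice for this sketch. */
#define SKETCH_SEG_MAX_SIZE 256

/* Hypothetical base segment: what PML/OSC code would be allowed to see. */
typedef struct {
    uint64_t addr;
    uint64_t len;
} sketch_base_segment_t;

/* Hypothetical BTL-private segment: base part plus a larger key that
 * stays hidden inside the BTL. */
typedef struct {
    sketch_base_segment_t base;
    uint8_t key[24];
} sketch_big_segment_t;

/* Return 1 if seg_count segments of seg_size bytes each fit into the
 * reserved space, 0 otherwise. The division avoids overflow in
 * seg_count * seg_size. */
int sketch_segments_fit(size_t seg_count, size_t seg_size)
{
    if (seg_size == 0) {
        return 0;
    }
    return seg_count <= (size_t)SKETCH_SEG_MAX_SIZE / seg_size;
}
```

A caller that copies opaque segments would pass the BTL's own segment
size, for example sketch_segments_fit(n, sizeof(sketch_big_segment_t)),
and refuse to pack the segments when the check fails.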


Re: [OMPI devel] RFC: Pineapple Runtime Interposition Project

2012-06-18 Thread Josh Hursey
That sounds good.

The approach for this project can be seen as multi-phased.

Phase 0: Baseline
 - Baseline implementation of just the interfaces that OMPI uses.
 - This is what I want to commit to the trunk next week.

Phases 1-N: Interface enhancements
 - There are a number of suggested enhancements for the interface
going forward. From making the API a bit more general to exposing more
of ORTE through the pineapple interface.
 - Per the 'OMPI/ORTE/OPAL stack is king' discussion, any interface
modifications needed for projects outside of the OMPI/ORTE/OPAL stack
need to be discussed by the Open MPI community.
 - As Ralph pointed out, this will probably not be an easy discussion at times.
 - No timeline for these phases

I intentionally did not want to blur Phase 0 with Phases 1-N so that we
can get things going. Past attempts at this suggest that the Phase 1-N
discussion tends to sink the conversation, and the good software
development in Phase 0 is lost. So I want to get Phase 0 in the trunk;
then, if folks want to talk about interface tweaks, we can do so as
needed going forward.

I should note that at this time I, personally, do not have any
interface items to be addressed. This interface is sufficient for what
I need right now.

I'll get back to work finishing up the branch :)

-- Josh



On Mon, Jun 18, 2012 at 9:59 AM, Ralph Castain  wrote:
> No disagreement over the approach - having the interface only cover OMPI as 
> it sits is fine. As I said at the meeting, those of us using ORTE for other 
> purposes have no real reason to need "pineapple" and can just work directly 
> with the ORTE interfaces (GP will commit to this route). Based on that plus 
> your comments, I would leave the interface alone for now.
>
> I remain unconvinced by the "put other RTEs under OMPI" argument, as you 
> know, but I won't belabor it. We'll let time show us just how real that 
> concern is. For now, as we agreed at the meeting, we'll modify the interface 
> as required to meet OMPI's needs for its integration with the ORTE trunk. We 
> know that will mean some near-term changes as we work on modex, but we can 
> adjust as needed.
>
> I don't really care about the name, but just want something usably short. I'm 
> content with the old "Ompi Runtime Services Layer" (ORSL), if you want to go 
> back to it. Not sure we need to spend cycles trying to get the name to 
> reflect the precedence agreement - the precise definition of the API is 
> likely to get argued over multiple times regardless, based on prior history, 
> so a stormy future for this proposal (regardless of name) is a reasonable 
> prediction.
>
> Thanks Josh!
> Ralph
>
> On Jun 18, 2012, at 7:06 AM, Josh Hursey wrote:
>
>> On Sat, Jun 16, 2012 at 10:32 AM, Ralph Castain  wrote:
>>> Hi Josh
>>>
>>> I had a chance to review your code this morning, and generally find it is 
>>> okay with me. I see a couple of things that appear to limit it, though they 
>>> may be intentional:
>>
>> In this pass of the pineapple interface I included only those
>> interfaces that OMPI uses without extension. So if OMPI did not use a
>> parameter or only used an interface in a single way then I simplified
>> the interface appropriately. The intention in this first pass was to
>> provide only what OMPI needs from ORTE in the interface, and nothing
>> more.
>>
>> Extending this interface to be more flexible and extensible I saw as a
>> secondary discussion that the group can have moving forward from this
>> initial baseline.
>>
>>>
>>> 1. the call to pineapple_init really needs a third flag to define the 
>>> process type. Locking the underlying orte_init to MPI seems to somewhat 
>>> defeat your goal of allowing pineapple to be used for non-MPI purposes
>>
>>
>> For OMPI, we only ever pass one type to the orte_init function. As
>> such, I eliminated the third parameter since it was unnecessary for
>> OMPI in this baseline pass.
>>
>> However, it may be useful to have for non-MPI purposes. Maybe an 'int'
>> parameter with pineapple only defining that '0' is OMPI_PROCESS, and
>> all other values are to be defined in the future. However, I am
>> hesitant to extend the interface for what we 'might' do, but rather
>> only extend the interface for what we will support going forward. But
>> if you have a good use case, then we can discuss adding it to support
>> non-MPI layers above. I just did not do so since it is not what OMPI
>> needed, so why make a more complex API in this first pass.
>>
>>>
>>> 2. the barrier and other collectives are locked to the MPI_Init and 
>>> MPI_Finalize procedures due to hardcoding of the collective id. You might 
>>> want to consider altering the API to pass a collective id down so these 
>>> functions can be used in other places.
>>
>>
>> Making this more generic would be useful. I'll see what I can do when
>> I dig back in this week. I hardcoded 

Re: [OMPI devel] RFC: Pineapple Runtime Interposition Project

2012-06-18 Thread Ralph Castain
No disagreement over the approach - having the interface only cover OMPI as it 
sits is fine. As I said at the meeting, those of us using ORTE for other 
purposes have no real reason to need "pineapple" and can just work directly 
with the ORTE interfaces (GP will commit to this route). Based on that plus 
your comments, I would leave the interface alone for now.

I remain unconvinced by the "put other RTEs under OMPI" argument, as you know, 
but I won't belabor it. We'll let time show us just how real that concern is. 
For now, as we agreed at the meeting, we'll modify the interface as required to 
meet OMPI's needs for its integration with the ORTE trunk. We know that will 
mean some near-term changes as we work on modex, but we can adjust as needed.

I don't really care about the name, but just want something usably short. I'm 
content with the old "Ompi Runtime Services Layer" (ORSL), if you want to go 
back to it. Not sure we need to spend cycles trying to get the name to reflect 
the precedence agreement - the precise definition of the API is likely to get 
argued over multiple times regardless, based on prior history, so a stormy 
future for this proposal (regardless of name) is a reasonable prediction.

Thanks Josh!
Ralph

On Jun 18, 2012, at 7:06 AM, Josh Hursey wrote:

> On Sat, Jun 16, 2012 at 10:32 AM, Ralph Castain  wrote:
>> Hi Josh
>> 
>> I had a chance to review your code this morning, and generally find it is 
>> okay with me. I see a couple of things that appear to limit it, though they 
>> may be intentional:
> 
> In this pass of the pineapple interface I included only those
> interfaces that OMPI uses without extension. So if OMPI did not use a
> parameter or only used an interface in a single way then I simplified
> the interface appropriately. The intention in this first pass was to
> provide only what OMPI needs from ORTE in the interface, and nothing
> more.
> 
> Extending this interface to be more flexible and extensible I saw as a
> secondary discussion that the group can have moving forward from this
> initial baseline.
> 
>> 
>> 1. the call to pineapple_init really needs a third flag to define the 
>> process type. Locking the underlying orte_init to MPI seems to somewhat 
>> defeat your goal of allowing pineapple to be used for non-MPI purposes
> 
> 
> For OMPI, we only ever pass one type to the orte_init function. As
> such, I eliminated the third parameter since it was unnecessary for
> OMPI in this baseline pass.
> 
> However, it may be useful to have for non-MPI purposes. Maybe an 'int'
> parameter with pineapple only defining that '0' is OMPI_PROCESS, and
> all other values are to be defined in the future. However, I am
> hesitant to extend the interface for what we 'might' do, but rather
> only extend the interface for what we will support going forward. But
> if you have a good use case, then we can discuss adding it to support
> non-MPI layers above. I just did not do so since it is not what OMPI
> needed, so why make a more complex API in this first pass.
> 
>> 
>> 2. the barrier and other collectives are locked to the MPI_Init and 
>> MPI_Finalize procedures due to hardcoding of the collective id. You might 
>> want to consider altering the API to pass a collective id down so these 
>> functions can be used in other places.
> 
> 
> Making this more generic would be useful. I'll see what I can do when
> I dig back in this week. I hardcoded them since that is exactly how
> OMPI uses them today. But having them more widely accessible would be
> better.
> 
> However, as you allude to in the next paragraph, we do not want to
> define an ultimate generalized RTE abstraction. So we must be careful
> when defining more generic interfaces that we are not hindering the
> OMPI/ORTE/OPAL stack and that we have a testable usecase for the
> interface extension. This case (collective uses) is easier, but I can
> think of other places where it could get more delicate.
> 
>> 
>> Finally, we have to get rid of the "pineapple" name. It seems to me that the 
>> primary purpose of this work is to allow ORTE to be used more generally, and 
>> to support multiple variants of ORTE within OMPI. So how about calling it 
>> "ORte Abstraction Layer", or ORAL? This would emphasize that we are not 
>> trying to create the ultimate generalized RTE abstraction, which I think is 
>> important for all the reasons raised at the recent meeting.
> 
> 
> Renaming 'pineapple' is the very last step. So we can discuss that
> until just before it comes into the trunk.
> 
> The primary purpose of this effort is twofold. First, to allow ORTE
> to be used more generally (something other than, or in addition to,
> OMPI above it). Secondly, to allow OMPI to be used across different
> RTEs (something other than, or in addition to, ORTE below it). We are
> not trying to create the ultimate generalized RTE abstraction layer,
> but something that serves the primary master of the interaction
> between OMPI/ORTE.
> 
> H

Re: [OMPI devel] openib Dynamic SL opensm-devel usage

2012-06-18 Thread TERRY DONTJE
Never mind the post below.  I was wrong about opensm-devel not existing 
on OL6.2.  However, I still have the issue of the dependency on libosmcomp.so 
that I would like to put into ompi_check_openib.m4.  Is anyone against my 
putting in a dependency on libosmcomp.so for btl_openib_connect_sl.o?


--td

On 6/18/2012 7:06 AM, TERRY DONTJE wrote:
I've run into an issue compiling openib's Dynamic SL support on a RH 
6.2 based system with the Oracle Studio compilers.


Turns out if I compile btl_openib_connect_sl.c with the Oracle Studio 
compilers with the "-g" option the compiler compiles some static 
inline functions in ib_types.h standalone (as opposed to ignoring the 
functions since they are not called in the btl_openib_connect_sl.c 
source).  This creates a dependency on the symbol ib_error_str in 
btl_openib_connect_sl.o .  Note this symbol is defined in libosmcomp.so.


My question is should btl_openib_connect_sl.c be linking to 
libosmcomp.so since btl_openib_connect_sl.c  is including ib_types.h 
or is there an assumption being made that btl_openib_connect_sl.c is 
just using macros/defines provided by the header and nothing requiring 
access to libosmcomp.so?


I ask this because I can make my original issue go away on RH 5.X 
systems if I link in libosmcomp.so. However, this library doesn't exist 
on RH 6.2 systems without the RH 5 compat headers package, and it doesn't 
have a 32-bit version on RH 6.2 systems at all.  The point is that if I 
try to fix the libosmcomp.so dependency by doing an AC_CHECK_LIB, RH 6.X 
systems will actually stop configuring in Dynamic SL.


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com 







--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com 





Re: [OMPI devel] RFC: Pineapple Runtime Interposition Project

2012-06-18 Thread Josh Hursey
On Sat, Jun 16, 2012 at 10:32 AM, Ralph Castain  wrote:
> Hi Josh
>
> I had a chance to review your code this morning, and generally find it is 
> okay with me. I see a couple of things that appear to limit it, though they 
> may be intentional:

In this pass of the pineapple interface I included only those
interfaces that OMPI uses without extension. So if OMPI did not use a
parameter or only used an interface in a single way then I simplified
the interface appropriately. The intention in this first pass was to
provide only what OMPI needs from ORTE in the interface, and nothing
more.

Extending this interface to be more flexible and extensible I saw as a
secondary discussion that the group can have moving forward from this
initial baseline.

>
> 1. the call to pineapple_init really needs a third flag to define the process 
> type. Locking the underlying orte_init to MPI seems to somewhat defeat your 
> goal of allowing pineapple to be used for non-MPI purposes


For OMPI, we only ever pass one type to the orte_init function. As
such, I eliminated the third parameter since it was unnecessary for
OMPI in this baseline pass.

However, it may be useful to have for non-MPI purposes. Maybe an 'int'
parameter with pineapple only defining that '0' is OMPI_PROCESS, and
all other values are to be defined in the future. However, I am
hesitant to extend the interface for what we 'might' do; I would rather
only extend the interface for what we will support going forward. But
if you have a good use case, then we can discuss adding it to support
non-MPI layers above. I just did not do so since it is not what OMPI
needed, so why make a more complex API in this first pass?
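
For the sake of discussion, the 'int' parameter idea might look like the
sketch below. Everything here is hypothetical: sketch_rte_init, the
SKETCH_* constants, and the return codes are made up for illustration and
are not the actual pineapple_init or orte_init signatures.

```c
#include <stddef.h>

/* Hypothetical process types: only 0 is defined for now, all other
 * values are reserved for future use. */
enum { SKETCH_PROC_OMPI = 0 };

/* Hypothetical return codes for this sketch. */
#define SKETCH_SUCCESS            0
#define SKETCH_ERR_NOT_SUPPORTED (-1)

/* Set once a supported process type has been initialized. */
static int sketch_initialized = 0;

/* Hypothetical init entry point with a third 'proc_type' argument.
 * A real layer would hand argc/argv and the mapped process type down
 * to the underlying runtime; this stub only validates the type. */
int sketch_rte_init(int *argc, char ***argv, int proc_type)
{
    (void)argc;
    (void)argv;
    if (proc_type != SKETCH_PROC_OMPI) {
        return SKETCH_ERR_NOT_SUPPORTED;  /* reserved, not yet defined */
    }
    sketch_initialized = 1;
    return SKETCH_SUCCESS;
}
```

Rejecting every value other than 0 keeps the baseline behavior identical
to the two-argument form while leaving room to define more types later.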

>
> 2. the barrier and other collectives are locked to the MPI_Init and 
> MPI_Finalize procedures due to hardcoding of the collective id. You might 
> want to consider altering the API to pass a collective id down so these 
> functions can be used in other places.


Making this more generic would be useful. I'll see what I can do when
I dig back in this week. I hardcoded them since that is exactly how
OMPI uses them today. But having them more widely accessible would be
better.

However, as you allude to in the next paragraph, we do not want to
define an ultimate generalized RTE abstraction. So we must be careful
when defining more generic interfaces that we are not hindering the
OMPI/ORTE/OPAL stack and that we have a testable usecase for the
interface extension. This case (collective uses) is easier, but I can
think of other places where it could get more delicate.
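
As a rough sketch of what passing the collective id down could mean, the
code below contrasts a caller-supplied id with wrappers that keep the
hardcoded ids. All names (sketch_barrier, sketch_coll_id_t, the id values)
are hypothetical, and the barrier itself is a single-process stub, not the
real ORTE collective.

```c
#include <stdint.h>

/* Hypothetical collective id type and the two ids a baseline layer
 * might hardcode for MPI_Init and MPI_Finalize. */
typedef uint32_t sketch_coll_id_t;
enum { SKETCH_COLL_INIT = 1, SKETCH_COLL_FINALIZE = 2 };

#define SKETCH_MAX_COLL 8

/* Number of completed barriers per collective id. */
static unsigned sketch_completed[SKETCH_MAX_COLL];

/* Generalized form: the caller chooses the collective id. A real
 * runtime would block here until all peers reach the barrier with the
 * same id; this stub only records the call. Returns 0 on success and
 * -1 for an id outside the supported range. */
int sketch_barrier(sketch_coll_id_t id)
{
    if (id >= SKETCH_MAX_COLL) {
        return -1;
    }
    sketch_completed[id]++;
    return 0;
}

/* Baseline-style wrappers that keep the ids hardcoded. */
int sketch_barrier_init(void)     { return sketch_barrier(SKETCH_COLL_INIT); }
int sketch_barrier_finalize(void) { return sketch_barrier(SKETCH_COLL_FINALIZE); }
```

Keeping the wrappers means existing MPI_Init/MPI_Finalize call sites
would not need to change if the generic entry point were added.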

>
> Finally, we have to get rid of the "pineapple" name. It seems to me that the 
> primary purpose of this work is to allow ORTE to be used more generally, and 
> to support multiple variants of ORTE within OMPI. So how about calling it 
> "ORte Abstraction Layer", or ORAL? This would emphasize that we are not 
> trying to create the ultimate generalized RTE abstraction, which I think is 
> important for all the reasons raised at the recent meeting.


Renaming 'pineapple' is the very last step. So we can discuss that
until just before it comes into the trunk.

The primary purpose of this effort is twofold. First, to allow ORTE
to be used more generally (something other than, or in addition to,
OMPI above it). Secondly, to allow OMPI to be used across different
RTEs (something other than, or in addition to, ORTE below it). We are
not trying to create the ultimate generalized RTE abstraction layer,
but something that serves the primary master of the interaction
between OMPI/ORTE.

However, if we consider calling it the 'ORte Abstraction Layer' for
OMPI then we could just as easily call it the 'OMPI RTE Abstraction
Layer' for ORTE. And then we quickly get back to the issue of who owns
the interface, and which project stack it serves. I think a better
name is one that ties OMPI and ORTE together - maybe... OMPI and ORTE
Synergistic Abstraction (OOSA) Layer. OMPI/ORTE/OPAL stack is king,
and having the name reflect that would be good. I am just not sure
what that would be at the moment.

-- Josh

>
> HTH
> Ralph
>
>
> On Jun 15, 2012, at 12:55 PM, Josh Hursey wrote:
>
>> What: A Runtime Interposition Project - Codename Pineapple
>>
>> Why: Define clear API and semantics for runtime requirements of the OMPI 
>> layer.
>>
>> When:
>> - F June 22, 2012 - Work completed
>> - T June 26, 2012 - Discuss on teleconf
>> - R June 28, 2012 - Commit to trunk
>>
>> Where: Trunk (development BitBucket branch below)
>>  https://bitbucket.org/jjhursey/ompi-pineapple
>>
>> Attached:
>>  PDF of slides presented on the June 12, 2012 teleconf. Note that the
>> timeline was slightly adjusted above (work completed date moved
>> earlier).
>>
>>
>> Description: Short Version
>> --
>> Define, in an 'rte.h', the interfaces and semantics that the OMPI
>> layer requires of a runtime environment. Currently this interface
>> matches the subset of ORTE functionality that is used by the OMPI
>> layer. Runt

[OMPI devel] openib Dynamic SL opensm-devel usage

2012-06-18 Thread TERRY DONTJE
I've run into an issue compiling openib's Dynamic SL support on a RH 
6.2 based system with the Oracle Studio compilers.


Turns out if I compile btl_openib_connect_sl.c with the Oracle Studio 
compilers with the "-g" option the compiler compiles some static inline 
functions in ib_types.h standalone (as opposed to ignoring the functions 
since they are not called in the btl_openib_connect_sl.c source).  This 
creates a dependency on the symbol ib_error_str in 
btl_openib_connect_sl.o .  Note this symbol is defined in libosmcomp.so.


My question is should btl_openib_connect_sl.c be linking to 
libosmcomp.so since btl_openib_connect_sl.c  is including ib_types.h or 
is there an assumption being made that btl_openib_connect_sl.c is just 
using macros/defines provided by the header and nothing requiring access 
to libosmcomp.so?


I ask this because I can make my original issue go away on RH 5.X 
systems if I link in libosmcomp.so. However, this library doesn't exist 
on RH 6.2 systems without the RH 5 compat headers package, and it doesn't 
have a 32-bit version on RH 6.2 systems at all.  The point is that if I 
try to fix the libosmcomp.so dependency by doing an AC_CHECK_LIB, RH 6.X 
systems will actually stop configuring in Dynamic SL.
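
To make the pattern concrete, here is a minimal single-file sketch of the
situation described above. The sketch_* names are hypothetical stand-ins
(sketch_error_str plays the role of ib_error_str, and the static inline
helper plays the role of the ones in ib_types.h); none of this is copied
from the real headers.

```c
/* Stand-in for a symbol exported by a separate shared library (the
 * example above is ib_error_str in libosmcomp.so). It is defined here
 * only so that this sketch links on its own; in the real case the
 * definition lives in the library. */
const char *sketch_error_str(int status)
{
    return status == 0 ? "success" : "error";
}

/* Stand-in for a static inline helper defined in a header. Nothing
 * below calls it, so a compiler may drop it entirely. A compiler that
 * emits it out of line anyway leaves a reference to sketch_error_str
 * in the object file, which then has to be resolved at link time. */
static inline const char *sketch_status_name(int status)
{
    return sketch_error_str(status);
}

/* The part of the header this file really uses: a plain constant.
 * 15 is the highest InfiniBand service level (SLs are 4 bits wide). */
#define SKETCH_MAX_SL 15

/* Clamp a requested service level into the range 0..SKETCH_MAX_SL. */
int sketch_clamp_sl(int sl)
{
    if (sl < 0) {
        return 0;
    }
    return sl > SKETCH_MAX_SL ? SKETCH_MAX_SL : sl;
}
```

If the definition of sketch_error_str were moved into a separate library,
this file would link without that library only when the compiler discards
the unused helper, which is the behavior difference seen with "-g" above.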


--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com