[OMPI devel] Potential issue with PERUSE_COMM_MSG_MATCH_POSTED_REQ event called for unexpected matches

2007-08-22 Thread Terry D. Dontje
I thought I would run this by the group before trying to unravel the
code and figure out how to fix the problem.  It looks to me from some
experimentation that when a process matches an unexpected message, the
PERUSE framework incorrectly fires a
PERUSE_COMM_MSG_MATCH_POSTED_REQ event in addition to the
PERUSE_COMM_REQ_MATCH_UNEX event.  I believe this is wrong: the
former event should not be fired in this case.
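
For anyone who wants to reproduce this, below is a minimal observer
sketch written against the standard PERUSE C interface (PERUSE_Init,
PERUSE_Query_event, PERUSE_Event_comm_register, PERUSE_Event_activate);
the helper names and the delayed-receive trick are mine, error checks
are omitted, and the exact prototypes should be checked against
peruse.h.  With the send arriving before the receive is posted, both
callbacks fire on the receiver:

#include <stdio.h>
#include <unistd.h>
#include <mpi.h>
#include <peruse.h>

/* Print which PERUSE event fired; the event name is passed in as the
 * registration parameter. */
static int match_cb(peruse_event_h eh, MPI_Aint unique_id,
                    peruse_comm_spec_t *spec, void *param)
{
    (void)eh; (void)unique_id;
    printf("%s fired (peer=%d tag=%d)\n",
           (const char *)param, spec->peer, spec->tag);
    return MPI_SUCCESS;
}

/* Register and activate one named event on a communicator. */
static void watch(const char *name, MPI_Comm comm, peruse_event_h *eh)
{
    int event;
    PERUSE_Query_event(name, &event);
    PERUSE_Event_comm_register(event, comm, match_cb, (void *)name, eh);
    PERUSE_Event_activate(*eh);
}

int main(int argc, char **argv)
{
    int rank, sbuf = 42, rbuf;
    peruse_event_h eh_posted, eh_unex;

    MPI_Init(&argc, &argv);
    PERUSE_Init();
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    watch("PERUSE_COMM_MSG_MATCH_POSTED_REQ", MPI_COMM_WORLD, &eh_posted);
    watch("PERUSE_COMM_REQ_MATCH_UNEX", MPI_COMM_WORLD, &eh_unex);

    if (0 == rank) {
        MPI_Send(&sbuf, 1, MPI_INT, 1, 7, MPI_COMM_WORLD);
    } else if (1 == rank) {
        sleep(1);   /* let the message arrive first, so it is unexpected */
        MPI_Recv(&rbuf, 1, MPI_INT, 0, 7, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}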


If the above assumption is true, I think the problem arises because the
PERUSE_COMM_MSG_MATCH_POSTED_REQ event is fired in
mca_pml_ob1_recv_request_progress, which is called by
mca_pml_ob1_recv_request_match_specific when a match of an unexpected
message has occurred.  I am wondering whether the
PERUSE_COMM_MSG_MATCH_POSTED_REQ event should instead be fired from a
routine that is centered on the posted queue, something like
mca_pml_ob1_recv_frag_match?
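
To make the suggestion concrete, here is a toy model of the two match
paths (plain C with entirely hypothetical names and data structures;
this is not ob1 code, just where I think each notification belongs):

#include <stdio.h>

/* Toy model: a posted-receive queue and an unexpected-message queue,
 * plus the two paths on which a match can happen. */

enum { QLEN = 8 };

typedef struct { int tag; int valid; } entry_t;

static entry_t posted[QLEN];     /* posted receive requests          */
static entry_t unexpected[QLEN]; /* arrived-but-unmatched fragments  */

/* Remove and return the index of a matching entry, or -1. */
static int find(entry_t *q, int tag)
{
    for (int i = 0; i < QLEN; i++)
        if (q[i].valid && q[i].tag == tag) { q[i].valid = 0; return i; }
    return -1;
}

static void push(entry_t *q, int tag)
{
    for (int i = 0; i < QLEN; i++)
        if (!q[i].valid) { q[i].valid = 1; q[i].tag = tag; return; }
}

/* Path 1: a fragment arrives (analogous to mca_pml_ob1_recv_frag_match). */
static void frag_arrives(int tag)
{
    if (find(posted, tag) >= 0) {
        /* The incoming message matched an already posted request:
         * this is where MSG_MATCH_POSTED_REQ belongs. */
        printf("fire PERUSE_COMM_MSG_MATCH_POSTED_REQ (tag %d)\n", tag);
    } else {
        push(unexpected, tag);   /* queue it; no match event yet */
    }
}

/* Path 2: a receive is posted (analogous to
 * mca_pml_ob1_recv_request_match_specific). */
static void recv_posted(int tag)
{
    if (find(unexpected, tag) >= 0) {
        /* The request matched an unexpected message: only
         * REQ_MATCH_UNEX should fire here, not MSG_MATCH_POSTED_REQ. */
        printf("fire PERUSE_COMM_REQ_MATCH_UNEX (tag %d)\n", tag);
    } else {
        push(posted, tag);
    }
}

int main(void)
{
    recv_posted(1);   /* receive posted first ...                  */
    frag_arrives(1);  /* ... message matches a posted request      */
    frag_arrives(2);  /* message arrives first (unexpected)        */
    recv_posted(2);   /* receive matches the unexpected message    */
    return 0;
}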


Suggestions...thoughts?

--td



Re: [OMPI devel] [RFC] Runtime Services Layer

2007-08-22 Thread Ralph H Castain
Just returned from vacation...sorry for the delayed response.

In the past, I have expressed three concerns about the RSL. I'll aggregate
them here for those who haven't seen them before - and apologize in advance
for the long note.

For those who want the short version, the (somewhat related) concerns are:

1. What problem are we really trying to solve?
2. Who is going to maintain old RTE versions, and why?
3. Are we constraining ourselves from further improvements in startup
performance?

My bottom line recommendation: I have no philosophical issue with the RSL
concept. However, I recommend holding off until the next version of ORTE is
completed and then re-evaluating to see how valuable the RSL might be, as
that next version will include memory footprint reduction and framework
consolidation that may yield much of the RSL's value without the extra work.


Long version:

1. What problem are we really trying to solve?
If the RSL is intended to solve the Cray support problem (where the Cray OS
really just wants to see OMPI, not ORTE), then it may have some value. The
issue to date has revolved around the difficulty of maintaining the Cray
port in the face of changes to ORTE - as new frameworks are added, special
components for Cray also need to be created to provide a "do-nothing"
capability. In addition, the Cray is memory constrained, and the ORTE
library occupies considerable space while providing very little
functionality.

The degree of value provided by the RSL will therefore depend somewhat on the
efficacy of the changes in development within ORTE. Those changes will,
among other things, significantly consolidate and reduce the number of
frameworks, and reduce the memory footprint. The expectation is that the
result will require only a single CNOS component in one framework. It isn't
clear, therefore, that the RSL will provide a significant value in that
environment.

If the RSL is intended to aid in ORTE development, as hinted at in the RFC,
then I believe that is questionable. Developing ORTE in a tmp branch has
proven reasonably effective as changes to the MPI layer are largely
invisible to ORTE. Creating another layer to the system that would also have
to be maintained seems like a non-productive way of addressing any problems
in that area.

If the RSL is intended as a means of "freezing" the MPI-RTE interface, then
I believe we could better attain that objective by simply defining a set of
requirements for the RTE. As I'll note below, freezing the interface at an
API level could negatively impact other Open MPI objectives.


2. Who is going to maintain old RTE versions, and why?
It isn't clear to me why anyone would want to do this - are we seriously
proposing that we maintain support for the ORTE layer that shipped with Open
MPI 1.0?? Can someone explain why we would want to do that?

Given what I know of ORTE, it seems questionable that, for example, one
could have RSL components for both the ORTE that shipped with Open MPI 1.0
and the ORTE that is currently in the trunk without writing a great deal of
RSL code. Creating an RSL component for the ORTE intended for Open MPI 1.3
would seem like even greater work as the flow of control is very different
(see below).

I'm sure one could overcome this with considerable code in the respective
RSL components - but I have difficulty understanding the value in doing all
that coding. Can someone explain that, and can we identify the personnel
(and/or their organization) that are willing to perform that function?


3. Are we constraining ourselves from further improvements in startup
performance?
This is my biggest area of concern. The RSL has been proposed as an
API-level definition. However, the MPI-RTE interaction really is defined in
terms of a flow-of-control - although each point of interaction is
instantiated as an API, the fact is that what happens at that point is not
independent of all prior interactions.

As an example of my concern, consider what we are currently doing with ORTE.
The latest change in requirements involves the need to significantly improve
startup time, reduce memory footprint, and reduce ORTE complexity. What we
are doing to meet that requirement is to review the delineation of
responsibilities between the MPI and RTE layers. The current delineation
evolved over time, with many of the decisions made at a very early point in
the program. For example, we instituted RTE-level stage gates (called from the
MPI layer) because, at the time they were needed, the MPI developers didn't
want to deal with them on their side (e.g., ensuring that the failure of one
proc wouldn't hang the system). Given today's level of maturity in the MPI
layer, we are now planning to move the stage gates into the MPI layer itself,
implemented as an "all-to-all" - this will remove several thousand lines of
code from ORTE and make it easier for the MPI layer to operate in non-ORTE
environments.
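
To illustrate what I mean by an "all-to-all" stage gate, here is a
standalone sketch of the pattern (this is not the ORTE/OMPI code, and the
endpoint payload is just a placeholder): the collective exchange itself is
the synchronization point, since no process gets past it until every
process has contributed its data.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <mpi.h>

#define INFO_LEN 64

int main(int argc, char **argv)
{
    int rank, size;
    char mine[INFO_LEN];
    char *all;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Stand-in for per-process endpoint/modex data. */
    snprintf(mine, sizeof(mine), "endpoint-of-rank-%d", rank);

    all = malloc((size_t)size * INFO_LEN);

    /* The all-to-all exchange doubles as the stage gate: the call cannot
     * complete until every process has contributed its data. */
    MPI_Allgather(mine, INFO_LEN, MPI_CHAR,
                  all,  INFO_LEN, MPI_CHAR, MPI_COMM_WORLD);

    if (0 == rank) {
        printf("gate passed; rank 0 sees %d endpoints, e.g. \"%s\"\n",
               size, all + (size - 1) * INFO_LEN);
    }

    free(all);
    MPI_Finalize();
    return 0;
}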

Similar efforts are underway to reduce ORTE involvement in the modex
operation and other parts of

[OMPI devel] Orted problem

2007-08-22 Thread Carlos Segura
Hi, I am having a problem with the latest version of Open MPI.
In some executions (roughly 1 in 100), the following message is printed:
 [tegasaste:01617] [NO-NAME] ORTE_ERROR_LOG: File read failure in file
util/universe_setup_file_io.c at line 123
It seems as if it tries to read the universe file and finds nothing in it.
If I look at the file, it contains correct information. It seems as if the
file has been created but not yet filled when the read is executed.
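
For what it is worth, if the problem really is a reader seeing a
half-written universe file, the usual way to avoid that kind of race on
the writer side is to write to a temporary file and rename() it into
place, since the rename is atomic on POSIX filesystems. A generic sketch
of the pattern (not the actual Open MPI code; the file name and contents
are placeholders):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write the contents to a temporary file, then rename it into place so
 * readers see either the old file or the complete new one, never a
 * partially written file. */
static int write_setup_file(const char *path, const char *contents)
{
    char tmp[4096];
    FILE *fp;

    snprintf(tmp, sizeof(tmp), "%s.tmp.%ld", path, (long)getpid());

    fp = fopen(tmp, "w");
    if (NULL == fp) return -1;

    if (fputs(contents, fp) == EOF) {
        fclose(fp);
        unlink(tmp);
        return -1;
    }
    if (fclose(fp) != 0) {
        unlink(tmp);
        return -1;
    }

    /* Publish the file only once it is complete. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}

int main(void)
{
    return write_setup_file("universe-setup.txt",
                            "example universe information\n")
               ? EXIT_FAILURE : EXIT_SUCCESS;
}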

The output of the ompi_info command:

Open MPI: 1.2.3
   Open MPI SVN revision: r15136
Open RTE: 1.2.3
   Open RTE SVN revision: r15136
OPAL: 1.2.3
   OPAL SVN revision: r15136
  Prefix: /soft/openmpi1.2.3
 Configured architecture: i686-pc-linux-gnu
   Configured by: csegura
   Configured on: Wed Aug 22 04:25:19 WEST 2007
  Configure host: tegasaste
Built by: csegura
Built on: Wed Aug 22 04:38:34 WEST 2007
  Built host: tegasaste
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: yes (all)
  Fortran90 bindings: yes
 Fortran90 bindings size: small
  C compiler: gcc
 C compiler absolute: /usr/bin/gcc
C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
  Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
  Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
 C profiling: yes
   C++ profiling: yes
 Fortran77 profiling: yes
 Fortran90 profiling: yes
  C++ exceptions: no
  Thread support: posix (mpi: no, progress: no)
  Internal debug support: no
 MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
 libltdl support: yes
   Heterogeneous support: yes
 mpirun default --prefix: no
   MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.3)
  MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.3)
   MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.3)
   MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.3)
   MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.3)
 MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.3)
 MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.3)
   MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
   MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.3)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.3)
  MCA io: romio (MCA v1.0, API v1.0, Component v1.2.3)
   MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.3)
   MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.3)
 MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.3)
 MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.3)
 MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.3)
  MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.3)
 MCA btl: openib (MCA v1.0, API v1.0.1, Component v1.2.3)
 MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.3)
 MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.3)
 MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.3)
 MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.3)
  MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.3)
  MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.3)
  MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.3)
 MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.3)
 MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.3)
 MCA gpr: replica (MCA v1.0, API v1.0, Component v1.2.3)
 MCA iof: proxy (MCA v1.0, API v1.0, Component v1.2.3)
 MCA iof: svc (MCA v1.0, API v1.0, Component v1.2.3)
  MCA ns: proxy (MCA v1.0, API v2.0, Component v1.2.3)
  MCA ns: replica (MCA v1.0, API v2.0, Component v1.2.3)
 MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
 MCA ras: dash_host (MCA v1.0, API v1.3, Component v1.2.3)
 MCA ras: gridengine (MCA v1.0, API v1.3, Component v1.2.3)
 MCA ras: localhost (MCA v1.0, API v1.3, Component v1.2.3)
 MCA ras: slurm (MCA v1.0, API v1.3, Component v1.2.3)
 MCA rds: hostfile (MCA v1.0, API v1.3, Component v1.2.3)
 MCA rds: proxy (MCA v1.0, API v1.3, Component v1.2.3)
 MCA rds: resfile (MCA v1.