[OMPI devel] MPI_REAL2 support and Fortran ddt numbering

2007-06-19 Thread Rainer Keller
Hello dear all,
with the current numbering in mpif-common.h, the optional ddt MPI_REAL2 will 
break the binary compatibility of the Fortran interface from v1.2 to v1.3 
(see r15133).

Now, apart from MPI_REAL2 being of, let's say, rather minor importance, the 
group may feel that the numbering of datatypes is crucial to the end user and 
that the (once agreed upon) allowed binary incompatibility for major version 
number changes is void.

(The most important datatype that this change affects is MPI_DOUBLE_PRECISION: 
users will need to recompile their code with v1.3...)

Please raise Your hand if anybody cares.

Thanks,
Rainer
-- 

Dipl.-Inf. Rainer Keller   http://www.hlrs.de/people/keller
 High Performance Computing   Tel: ++49 (0)711-685 6 5858
   Center Stuttgart (HLRS)   Fax: ++49 (0)711-685 6 5832
 POSTAL:Nobelstrasse 19             email: kel...@hlrs.de
 ACTUAL:Allmandring 30, R.O.030     AIM:rusraink
 70550 Stuttgart
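
The compatibility concern raised in this message can be illustrated with a small sketch. It assumes that a Fortran application carries its datatype handles as plain integers fixed at compile time, and that the library resolves them by position in a table. The table contents and index values below are invented for illustration; they are not the actual values in mpif-common.h.

```python
# Hypothetical datatype tables: the names are real MPI Fortran datatypes,
# but the ordering and index values are invented for illustration and are
# NOT the contents of mpif-common.h.
V12_TABLE = ["MPI_BYTE", "MPI_INTEGER", "MPI_REAL", "MPI_DOUBLE_PRECISION"]

# A newer release that inserts MPI_REAL2 in the middle of the list.
V13_TABLE = ["MPI_BYTE", "MPI_INTEGER", "MPI_REAL", "MPI_REAL2",
             "MPI_DOUBLE_PRECISION"]


def compile_app(table, names):
    """Mimic compilation: each datatype name is frozen into an integer handle."""
    return {name: table.index(name) for name in names}


def resolve(table, handle):
    """Mimic the library looking up a handle passed in by the application."""
    return table[handle]


# An application built against the old table...
app_handles = compile_app(V12_TABLE, ["MPI_REAL", "MPI_DOUBLE_PRECISION"])

# ...and then run, without recompiling, against the new library.
seen_by_new_lib = {name: resolve(V13_TABLE, h) for name, h in app_handles.items()}
print(seen_by_new_lib)
# {'MPI_REAL': 'MPI_REAL', 'MPI_DOUBLE_PRECISION': 'MPI_REAL2'}
```

In this toy model, every datatype numbered after the insertion point is silently misinterpreted by the new library, which is why a recompile would be needed.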


Re: [OMPI devel] openib coord teleconf

2007-06-19 Thread Andrew Friedley

Oops, was this yesterday?  If it's today, what's the number?

Andrew

Jeff Squyres wrote:

On Jun 13, 2007, at 3:38 PM, Andrew Friedley wrote:

I'd like to call in as some of this applies to UD as well, is that  
okay?


Sounds good.


Andrew

Jeff Squyres wrote:

On Jun 13, 2007, at 2:41 PM, Gleb Natapov wrote:


Pasha tells me that the best times for Ishai and him are:

- 2000-2030 Israel time
- 1300-1300 US Eastern
- 1100-1130 US Mountain
- 2230-2300 India (Bangalore)

Although they could also do the preceding half hour as well.


Depends on the date. The closest I can do at 20:00 is June 19.

Oops!  I left out the date -- sorry.  I meant to say Monday, June
18th.  And I got the US eastern time wrong; that should have been
noon, not 1300.

20:00 Israel June 19th is right after the weekly OMPI teleconf; want
to do it then?


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel





Re: [OMPI devel] MPI_REAL2 support and Fortran ddt numbering

2007-06-19 Thread Terry D. Dontje

Rainer Keller wrote:

> Hello dear all,
> with the current numbering in mpif-common.h, the optional ddt MPI_REAL2
> will break the binary compatibility of the Fortran interface from v1.2
> to v1.3 (see r15133).
>
> Now, apart from MPI_REAL2 being of, let's say, rather minor importance,
> the group may feel that the numbering of datatypes is crucial to the
> end user and that the (once agreed upon) allowed binary incompatibility
> for major version number changes is void.
>
> (The most important datatype that this change affects is
> MPI_DOUBLE_PRECISION: users will need to recompile their code with
> v1.3...)
>
> Please raise Your hand if anybody cares.



Sun cares very much about this for the exact reason you state (Binary 
compatibility).

I'd prefer this ddt is placed at the end of the list.

thanks,

--td
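
A minimal sketch of why placing the new ddt at the end of the list keeps existing binaries working, assuming handles are resolved by position in a table. The table below and the helper names `append_datatype` and `is_backward_compatible` are hypothetical and are not part of Open MPI.

```python
def append_datatype(table, name):
    """Return a new table with `name` added at the end; old indices are untouched."""
    return table + [name]


def is_backward_compatible(old_table, new_table):
    """True if every handle valid in the old table still means the same thing."""
    return new_table[:len(old_table)] == old_table


# Invented ordering, for illustration only.
old = ["MPI_BYTE", "MPI_INTEGER", "MPI_REAL", "MPI_DOUBLE_PRECISION"]

appended = append_datatype(old, "MPI_REAL2")
inserted = old[:3] + ["MPI_REAL2"] + old[3:]

print(is_backward_compatible(old, appended))  # True
print(is_backward_compatible(old, inserted))  # False
```

A prefix check like this could, in principle, be run against the handle list of a previous release before a new datatype is committed.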



Re: [OMPI devel] MPI_REAL2 support and Fortran ddt numbering

2007-06-19 Thread Brian Barrett

On Jun 19, 2007, at 8:35 AM, Terry D. Dontje wrote:

> Rainer Keller wrote:
>
>> Hello dear all,
>> with the current numbering in mpif-common.h, the optional ddt MPI_REAL2
>> will break the binary compatibility of the Fortran interface from v1.2
>> to v1.3 (see r15133).
>>
>> Now, apart from MPI_REAL2 being of, let's say, rather minor importance,
>> the group may feel that the numbering of datatypes is crucial to the
>> end user and that the (once agreed upon) allowed binary incompatibility
>> for major version number changes is void.
>>
>> (The most important datatype that this change affects is
>> MPI_DOUBLE_PRECISION: users will need to recompile their code with
>> v1.3...)
>>
>> Please raise Your hand if anybody cares.
>
> Sun cares very much about this for the exact reason you state (Binary
> compatibility).
> I'd prefer this ddt is placed at the end of the list.


I think we should try to avoid binary compatibility changes at the  
MPI layer if we can, even between our "major" releases.  Especially  
if they don't take lots of work.  Now if only we would stop changing  
the size of ompi_communicator_t :).


Brian


Re: [OMPI devel] openib coord teleconf

2007-06-19 Thread Jeff Squyres
It's today.  If there's time left on the main OMPI call, we'll do it  
there.  Otherwise, I'll tell everyone on today's call what the number  
is for after the OMPI call.


On Jun 19, 2007, at 10:38 AM, Andrew Friedley wrote:


Oops, was this yesterday?  If it's today, what's the number?

Andrew




--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] MPI_REAL2 support and Fortran ddt numbering

2007-06-19 Thread Rainer Keller
Hello,
On Tuesday 19 June 2007 16:41, Brian Barrett wrote:
> >> Please raise Your hand if anybody cares.
> >
> > Sun cares very much about this for the exact reason you state (Binary
> > compatibility).
> > I'd prefer this ddt is placed at the end of the list.
>
> I think we should try to avoid binary compatibility changes at the
> MPI layer if we can, even between our "major" releases.  Especially
> if they don't take lots of work.  Now if only we would stop changing
> the size of ompi_communicator_t :).

Alright. As suggested, the two missing parts are committed.
To the unsuspecting Fortran app, everything should look the same...

With best regards,
Rainer


[OMPI devel] Unreliable Datagram BTL

2007-06-19 Thread Andrew Friedley
Galen asked for a writeup of where the UD BTL support is at and what 
(important) issues remain, so here it is.


Right now, to ensure MPI guaranteed delivery semantics, the DR PML must 
be used with UD -- the UD BTL does not implement its own reliability. 
The best solution would be to implement a lightweight reliability 
protocol within the UD BTL; this would be most effective with a progress 
thread.
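
As a rough illustration of what a lightweight reliability protocol over unreliable datagrams involves, here is a stop-and-wait sketch with sequence numbers and acknowledgements. It is a toy simulation only: the classes `LossyChannel` and `Receiver` and the function `send_reliably` are hypothetical, the loss pattern is fixed rather than random, and nothing here reflects how the UD BTL or the DR PML is actually implemented.

```python
from itertools import cycle


class LossyChannel:
    """Datagram channel that drops packets following a fixed, repeating pattern."""

    def __init__(self, drop_pattern):
        self._drops = cycle(drop_pattern)

    def deliver(self, packet):
        """Return the packet, or None if this transmission is dropped."""
        return None if next(self._drops) else packet


class Receiver:
    """Hands each sequence number up once and acknowledges everything it sees."""

    def __init__(self):
        self.expected = 0
        self.delivered = []

    def on_packet(self, seq, payload):
        if seq == self.expected:      # new data: hand it up
            self.delivered.append(payload)
            self.expected += 1
        return seq                    # ACK; duplicates are re-ACKed, not re-delivered


def send_reliably(payloads, data_channel, ack_channel, receiver, max_tries=10):
    """Stop-and-wait: retransmit each packet until its ACK makes it back."""
    transmissions = 0
    for seq, payload in enumerate(payloads):
        for _ in range(max_tries):
            transmissions += 1
            packet = data_channel.deliver((seq, payload))
            if packet is None:
                continue              # data lost; a real sender would time out here
            ack = ack_channel.deliver(receiver.on_packet(*packet))
            if ack == seq:
                break                 # acknowledged, move on to the next packet
        else:
            raise RuntimeError(f"gave up on packet {seq}")
    return transmissions


receiver = Receiver()
sent = send_reliably(
    ["a", "b", "c"],
    LossyChannel([True, False]),      # every other data packet is lost
    LossyChannel([False, True]),      # every other ACK is lost
    receiver,
)
print(receiver.delivered, sent)
```

Stop-and-wait is chosen here only because it is the shortest protocol that shows retransmission and duplicate suppression; a practical design would more likely use a sliding window, and the retransmission timer is where a progress thread would come in.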


Progress threads are a whole other issue. With a quick implementation, 
I was hitting all sorts of segfaults in the PML.  The UD BTL seems 
unique in that it is common for messages to be received and passed up to 
the PML out of order.  I can revisit this and file some bug reports 
sooner rather than later if desired.


I know of one outstanding bug -- any of the tests in the Intel suite 
using buffered sends fail with incorrect data.  I've shown this problem 
to George, Galen, and Brian, and we have yet to come up with a fix -- it 
appears to be an issue with messages arriving at the PML out of order, 
at which point the PML has no datatype information and so cannot 
reassemble the messages correctly.  This would need to be fixed for 1.3.
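
To make the out-of-order problem concrete, the sketch below shows one generic way a receiver can restore ordering when each fragment carries a sequence number: early fragments are held until their predecessors arrive. The `ReorderBuffer` class is hypothetical. It is not the PML's logic and not a proposed fix for this bug, only an illustration of the kind of bookkeeping in-order reassembly needs.

```python
class ReorderBuffer:
    """Holds fragments that arrive early and releases them in sequence order."""

    def __init__(self):
        self.next_seq = 0     # sequence number we are waiting for
        self.pending = {}     # early arrivals, keyed by sequence number

    def receive(self, seq, fragment):
        """Accept one fragment; return the fragments now deliverable, in order."""
        self.pending[seq] = fragment
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready


buf = ReorderBuffer()
arrivals = [(1, b"wor"), (0, b"hello "), (3, b"!"), (2, b"ld")]
message = b"".join(frag for seq, data in arrivals for frag in buf.receive(seq, data))
print(message)  # b'hello world!'
```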


When the UD BTL goes into the trunk, it will always de-select itself 
unless specifically requested with the MCA btl parameter (e.g. -mca btl 
ud,self).  This prevents the UD BTL from being used by default along 
with the existing RC (openib) BTL and possibly lowering performance.
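
The opt-in behaviour described above can be sketched as follows. This is a simplified model, not the MCA selection code: the function `select_btls`, its parameters, and the list of available components are assumptions made for illustration, and only the comma-separated form of the btl parameter is handled.

```python
def select_btls(available, requested=None, opt_in_only=frozenset({"ud"})):
    """Pick components; opt-in-only ones are used only when named explicitly."""
    if requested is None:
        # No btl parameter given: use everything except opt-in-only components.
        return [name for name in available if name not in opt_in_only]
    wanted = [name.strip() for name in requested.split(",") if name.strip()]
    return [name for name in wanted if name in available]


available = ["self", "sm", "openib", "ud"]  # hypothetical component list

print(select_btls(available))             # ['self', 'sm', 'openib']
print(select_btls(available, "ud,self"))  # ['ud', 'self']
```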


Some minor issues: when it hits the trunk, it will be called 'ofud', 
short for OpenFabrics Unreliable Datagrams.  Currently RDMA CM is not 
used, though it will not be hard to switch over (doing it at the same 
time as the openib BTL seems appropriate to me).


Andrew