[OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Dirk Eddelbuettel

Hi folks,

The 'new' Open MPI packages for Debian that are maintained by a few of us via
a group on alioth.debian.org have now [1] reached the main distribution. This
means they are being built on all supported architectures, and logs accumulate at
http://buildd.debian.org/build.php?pkg=openmpi

This shows success on 'alpha', 'amd64', 'ia64', 'powerpc' and 'x86' (implied,
my build architecture).  However, we have failures on 'hppa', 'mips' and
'mipsel'. The remaining ones ('arm', 'm68k', 's390' and 'sparc') are still
outstanding. 

Looking at the most recent error logs on the failing architectures we see

 i)  that hppa (in the way configure sees it) is not supported:

checking if .size is needed... yes
checking if .align directive takes logarithmic value... no
configure: error: No atomic primitives available for hppa-unknown-linux-gnu
make: *** [config.status] Error 1

Now, configure and aclocal have lots of 'hppa*64' statements.  Is it
enough to turn these into 'hppa*64*|hppa*linux*' or something similar?

This issue has previously been logged in the Debian Bug Tracking System,
see http://bugs.debian.org/431631


 ii) that mips croaks at the assembler level

ln -s "../../opal/asm/generated/atomic-local.s" atomic-asm.s
/bin/sh ../../libtool --mode=compile gcc  -DNDEBUG -Wall -g -O2 
-finline-functions -fno-strict-aliasing -c -o atomic-asm.lo atomic-asm.s
libtool: compile:  gcc -DNDEBUG -Wall -g -O2 -finline-functions 
-fno-strict-aliasing -c atomic-asm.s  -fPIC -DPIC -o .libs/atomic-asm.o
atomic-asm.s: Assembler messages:

I haven't looked in detail, but is there a non-assembler code branch we
could invoke?


 iii) mipsel is also not supported:

checking if .size is needed... yes
checking if .align directive takes logarithmic value... yes
configure: error: No atomic primitives available for mipsel-unknown-linux-gnu
make: *** [config.status] Error 1

What can we do here?  Can the Debian porters help with tests to devise
a mips/mipsel configuration?

Looking at the bug archive for 'openmpi' we see more failures:

iv)  s390 has an open bug about the same 'atomic primitives' issue, see 
 http://bugs.debian.org/376833

v)   m68k has an open bug about the same 'atomic primitives' issue, see
 http://bugs.debian.org/405929


It is possible to just declare a list of architectures on which to build,
but this is rather strongly discouraged.

Please let us (ie Debian's openmpi maintainers) know how else we can help.  I am
ccing the porters lists (for hppa, m68k, mips) too to invite them to help. I
hope that doesn't get the spam filters going...  I may contact the 'arm'
porters once we have a failure; s390 and sparc activity are not as big these
days.

Regards, Dirk



[1] New packages go into the NEW queue so that the ftpmaster can inspect the
packaging, licenses, ... and reorganised source packages with new or renamed
binary packages get the same treatment.

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison


Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Brian Barrett

On Jul 14, 2007, at 8:26 AM, Dirk Eddelbuettel wrote:

> Please let us (ie Debian's openmpi maintainers) know how else we can
> help.  I am ccing the porters lists (for hppa, m68k, mips) too to invite
> them to help. I hope that doesn't get the spam filters going...  I may
> contact the 'arm' porters once we have a failure; s390 and sparc activity
> are not as big these days.


Open MPI uses some assembly for things like atomic locks, atomic  
compare and swap, memory barriers, and the like.  We currently have  
support for:


  * x86 (32 bit)
  * x86_64 / amd64 (32 or 64 bit)
  * UltraSparc (v8plus and v9* targets)
  * IA64
  * PowerPC (32 or 64 bit)

We also have code for:

  * Alpha
  * MIPS (32 bit NEW ABI & 64 bit)

This support hasn't been well tested in a while, and it sounds like it
doesn't work for MIPS.  At one time, we supported the sparc v8
target, but that support has since been dropped.  The other platforms
(hppa, mipsel (how is this different than MIPS?), s390, m68k) aren't
supported by Open MPI at all.  If you can get the real error messages,
I can help on the MIPS issue, although it'll have to be a low priority.


We don't currently have support for a non-assembly code path.  We  
originally planned on having one, but the team went away from that  
route over time and there's no way to build Open MPI without assembly  
support right now.



Brian

--
  Brian W. Barrett
  Networking Team, CCS-1
  Los Alamos National Laboratory




Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Dirk Eddelbuettel

Hi Carlos,

Thanks for the quick reply.

On 14 July 2007 at 11:03, Carlos O'Donell wrote:
| On 7/14/07, Dirk Eddelbuettel  wrote:
| >  i)  that hppa (in the way configure sees it) is not supported:
| >
| > checking if .size is needed... yes
| > checking if .align directive takes logarithmic value... no
| > configure: error: No atomic primitives available for hppa-unknown-linux-gnu
| > make: *** [config.status] Error 1
| >
| > Now, configure and aclocal have lots of 'hppa*64' statements.  Is it
| > enough to turn these into  'hppa*64*|hppa*linux*' or something similar ?
| >
| > This issue has previously been logged in the Debian Bug Tracking System,
| > see http://bugs.debian.org/431631
| 
| That bug does not appear to have any relevance to the failed configure check.
| What atomic primitives does Open MPI need?

I am confused. I am not sure I understand your question.  Are you aware that
configure checks for this? Eg from my x86 build logs:

checking for pre-built assembly file... yes (atomic-ia32-linux.s)
checking for atomic assembly filename... atomic-ia32-linux.s

So atomic-$foo better be there for a given architecture foo as there are
sources for some platforms:

edd@basebud:~/src/debian/SVN/tarballs/openmpi-1.2.3/opal> ls -1 asm/generated/
atomic-alpha-linux.s
atomic-amd64-linux-nongas.s
atomic-amd64-linux.s
atomic-ia32-cygwin-nongas.s
atomic-ia32-cygwin.s
atomic-ia32-linux-nongas.s
atomic-ia32-linux.s
atomic-ia32-osx.s
atomic-ia64-linux-nongas.s
atomic-ia64-linux.s
atomic-mips-irix.s
atomic-powerpc32-64-osx.s
atomic-powerpc32-aix.s
atomic-powerpc32-linux-nongas.s
atomic-powerpc32-linux.s
atomic-powerpc32-osx.s
atomic-powerpc64-aix.s
atomic-powerpc64-linux-nongas.s
atomic-powerpc64-linux.s
atomic-powerpc64-osx.s
atomic-sparc-solaris.s
atomic-sparcv9-32-solaris.s
atomic-sparcv9-64-solaris.s

Methinks we need to fill in a few blanks here, or make do with non-asm
solutions. I don't know the problem space that well (being a maintainer
rather than upstream developer) and am looking for guidance.  

For what it's worth, lam (7.1.2, currently) is available on all build
architectures for Debian, but it may not push the (hardware) envelope as
hard.

Hope this helps, Dirk



Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Brian Barrett

On Jul 14, 2007, at 10:53 AM, Dirk Eddelbuettel wrote:


> Methinks we need to fill in a few blanks here, or make do with non-asm
> solutions. I don't know the problem space that well (being a maintainer
> rather than upstream developer) and am looking for guidance.


Either way is an option.  There are really only a couple of functions  
that have to be implemented:


  * atomic word-size compare and swap
  * memory barrier

We'll emulate atomic adds and spin-locks with compare and swap if not
directly implemented.  The memory barrier functions have to exist,
even if they don't do anything.  We require compare-and-swap for a
couple of pieces of code, which is why we lost our Sparc v8 support a
couple of releases ago.
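
To illustrate the point above, here is a minimal sketch of how atomic add and a spin-lock can be layered on a single word-size compare-and-swap plus a memory barrier. This is not Open MPI's code: the sketch_* names are hypothetical, and the GCC/Clang __sync builtins stand in for the per-platform assembly, on the assumption that the compiler provides them for the target.

```c
#include <stdint.h>

/* Hypothetical names throughout; the CAS comes from a compiler builtin
 * here, whereas Open MPI uses per-platform assembly. */
static inline int sketch_cmpset_32(volatile int32_t *addr,
                                   int32_t oldval, int32_t newval)
{
    /* Nonzero if *addr equalled oldval and was replaced by newval. */
    return __sync_bool_compare_and_swap(addr, oldval, newval);
}

/* Memory barrier: has to exist, even where it could be a no-op. */
static inline void sketch_mb(void)
{
    __sync_synchronize();
}

/* Atomic add emulated with a CAS retry loop; returns the new value. */
static inline int32_t sketch_add_32(volatile int32_t *addr, int32_t delta)
{
    int32_t oldval;
    do {
        oldval = *addr;
    } while (!sketch_cmpset_32(addr, oldval, oldval + delta));
    return oldval + delta;
}

/* Spin-lock emulated with CAS: 0 means unlocked, 1 means locked. */
static inline void sketch_lock(volatile int32_t *lock)
{
    while (!sketch_cmpset_32(lock, 0, 1)) {
        /* spin until the lock is seen free and we win the swap */
    }
    sketch_mb();
}

static inline void sketch_unlock(volatile int32_t *lock)
{
    sketch_mb();   /* make protected writes visible before releasing */
    *lock = 0;
}
```

A port to a new architecture would then, under these assumptions, only need to supply the two primitives at the top; the rest is built from them.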



> For what it's worth, lam (7.1.2, currently) is available on all build
> architectures for Debian, but it may not push the (hardware) envelope as
> hard.


Correct, LAM only had very limited ASM requirements (basically,  
memory barrier on platforms that required it -- like PowerPC).


Brian


Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Dirk Eddelbuettel

Hi Brian,

On 14 July 2007 at 10:47, Brian Barrett wrote:
| On Jul 14, 2007, at 8:26 AM, Dirk Eddelbuettel wrote:
| 
| > Please let us (ie Debian's openmpi maintainers) how else we can  
| > help.  I am
| > ccing the porters lists (for hppa, m68k, mips) too to invite them  
| > to help. I
| > hope that doesn't get the spam filters going...  I may contact the  
| > 'arm'
| > porters once we have a failure; s390 and sparc activity are not as  
| > big these
| > days.
| 
| Open MPI uses some assembly for things like atomic locks, atomic  
| compare and swap, memory barriers, and the like.  We currently have  
| support for:
| 
|   * x86 (32 bit)
|   * x86_64 / amd64 (32 or 64 bit)
|   * UltraSparc (v8plus and v9* targets)
|   * IA64
|   * PowerPC (32 or 64 bit)
| 
| We also have code for:
| 
|   * Alpha
|   * MIPS (32 bit NEW ABI & 64 bit)
| 
| This support isn't well tested in a while and it sounds like it  

We'd be glad to help. This has worked well for other projects. I think that
Debian is the quasi-official testbed for xfree.org given all our platforms.

So we can definitely try to get Alpha, Mips, ... up to speed with suitable
regression tests.

| doesn't work for MIPS.  At one time, we supported the sparc v8  
| target, but that The other platforms (hppa, mipsel (how is this  
| different than MIPS?), s390, m68k) aren't at all supported by Open  
| MPI.  If you can get the real error messages, I can help on the MIPS  
| issue, although it'll have to be a low priority.

I think mipsel is the little-endian variant. Something similar now exists for
arm where there's also armel.

Mips support would be nice as there are some HPC platforms based on these
chips.  Maybe someone from the debian-mips team can speak up and take a lead
here to work with you.

| We don't currently have support for a non-assembly code path.  We  
| originally planned on having one, but the team went away from that  
| route over time and there's no way to build Open MPI without assembly  
| support right now.

Personally, I think that's a fair call given what Open MPI sets out to do.
Debian 'at large' aims for the 'everything ought to build everywhere' model
(which has its merits too) so I'll have to see if we get pushback if we
restrict the platforms.

So given the list of current failures, hppa and mips/mipsel are the most
likely candidates for improvement.  Sparc and s390 are fairly dead at Debian
so not sure if anything will change there.  m68k is close to officially dead
but a few vocal enthusiasts try to keep it on life-support.

Cheers, Dirk



Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread George Bosilca
Instead of failing at configure time, we might want to disable the  
threading features and the shared memory device if we detect that we  
don't have support for atomics on a specified platform. In a non  
threaded build, the shared memory device is the only place where we  
need support for memory barrier. I'll look in the code to see why we  
need support for compare-and-swap on a non threaded build.


  Thanks,
george.

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel




Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Brian Barrett

On Jul 14, 2007, at 11:16 AM, George Bosilca wrote:


> Instead of failing at configure time, we might want to disable the
> threading features and the shared memory device if we detect that we
> don't have support for atomics on a specified platform. In a non
> threaded build, the shared memory device is the only place where we
> need support for memory barrier. I'll look in the code to see why we
> need support for compare-and-swap on a non threaded build.


George -

Disabling SM and threads if there's no atomic support would  
definitely be one option.  The compare-and-swap is used by the LIFO  
used for ompi free lists.


Brian


Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Gleb Natapov
On Sat, Jul 14, 2007 at 01:16:42PM -0400, George Bosilca wrote:
> Instead of failing at configure time, we might want to disable the  
> threading features and the shared memory device if we detect that we  
> don't have support for atomics on a specified platform. In a non  
> threaded build, the shared memory device is the only place where we  
> need support for memory barrier. I'll look in the code to see why we  
> need support for compare-and-swap on a non threaded build.
Proper memory barrier is also needed for openib BTL eager RDMA support.


--
Gleb.


Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Brian Barrett

On Jul 14, 2007, at 11:51 AM, Gleb Natapov wrote:


> On Sat, Jul 14, 2007 at 01:16:42PM -0400, George Bosilca wrote:
> > Instead of failing at configure time, we might want to disable the
> > threading features and the shared memory device if we detect that we
> > don't have support for atomics on a specified platform. In a non
> > threaded build, the shared memory device is the only place where we
> > need support for memory barrier. I'll look in the code to see why we
> > need support for compare-and-swap on a non threaded build.
> Proper memory barrier is also needed for openib BTL eager RDMA support.

Removed all the platform lists, since they won't care about this part :).


Ah, true.  The eager RDMA code should check that the preprocessor  
symbol OPAL_HAVE_ATOMIC_MEM_BARRIER is 1 and disable itself if that  
isn't the case.  All the "sections" of ASM support (memory barriers,  
locks, compare-and-swap, and atomic math) have preprocessor symbols  
indicating whether support exists or not in the current build.  These  
should really be used :).
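
A minimal sketch of the kind of guard described above, assuming only that OPAL_HAVE_ATOMIC_MEM_BARRIER is defined to 0 or 1 by the atomic headers. The helper name is hypothetical and this is not the openib BTL's actual code; the symbol is defaulted to 0 here purely so the fragment compiles on its own.

```c
/* In a real build this symbol would come from Open MPI's atomic headers.
 * Defaulting it here is an assumption made only for this sketch. */
#ifndef OPAL_HAVE_ATOMIC_MEM_BARRIER
#define OPAL_HAVE_ATOMIC_MEM_BARRIER 0
#endif

/* Hypothetical helper: returns 1 if eager RDMA may be used, 0 otherwise. */
static int sketch_eager_rdma_allowed(int user_requested)
{
#if OPAL_HAVE_ATOMIC_MEM_BARRIER
    /* Barriers are available, so honour the user's setting. */
    return user_requested ? 1 : 0;
#else
    /* No memory barrier on this platform: disable regardless of request. */
    (void) user_requested;
    return 0;
#endif
}
```

The same pattern would presumably apply to the other "sections" (locks, compare-and-swap, atomic math), each keyed on its own symbol.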


Brian


Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread George Bosilca

Brian,

We should be able to use these defines in the configure.m4 files for
each component, right? I think the asm section is detected before we
go into the component configuration.

So far we know about the following components that have to disable
themselves if no atomic or memory barrier is detected:

 - MPOOL: sm
 - BTL: sm, openib (completely or partially?)

Does anybody know of any other components with atomic requirements?

  george.





Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread George Bosilca
If OMPI_HAVE_THREAD_SUPPORT is not set, the LIFO falls back to a
default version where atomic operations are not required. We can even
remove the dependency on the atomic.h header if thread support is
not enabled.

Unfortunately, our shared memory device requires the atomic operations
plus the memory barriers. Therefore, we cannot do anything more
fine-grained (such as having a missing atomic compare-and-swap disable
only the threading support, and a missing memory barrier disable only
the shared memory support).
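
The LIFO fallback mentioned above can be sketched roughly as follows. This is an illustration under stated assumptions, not the ompi free-list code: the sketch_* names are hypothetical, SKETCH_HAVE_THREAD_SUPPORT is a stand-in for OMPI_HAVE_THREAD_SUPPORT, and the threaded branch uses the GCC/Clang __sync builtin in place of the project's own compare-and-swap.

```c
#include <stddef.h>

/* Stand-in for OMPI_HAVE_THREAD_SUPPORT; defaulted to 0 for this sketch. */
#ifndef SKETCH_HAVE_THREAD_SUPPORT
#define SKETCH_HAVE_THREAD_SUPPORT 0
#endif

typedef struct sketch_item {
    struct sketch_item *next;
    int payload;
} sketch_item_t;

typedef struct {
    sketch_item_t *volatile head;
} sketch_lifo_t;

static void sketch_lifo_push(sketch_lifo_t *lifo, sketch_item_t *item)
{
#if SKETCH_HAVE_THREAD_SUPPORT
    /* Retry until head is swapped from the value we linked against. */
    do {
        item->next = lifo->head;
    } while (!__sync_bool_compare_and_swap(&lifo->head, item->next, item));
#else
    /* Single-threaded build: plain pointer updates, no atomics needed. */
    item->next = lifo->head;
    lifo->head = item;
#endif
}

static sketch_item_t *sketch_lifo_pop(sketch_lifo_t *lifo)
{
#if SKETCH_HAVE_THREAD_SUPPORT
    sketch_item_t *item;
    do {
        item = lifo->head;
        if (NULL == item) {
            return NULL;
        }
        /* Note: a naive CAS pop like this is exposed to the ABA problem. */
    } while (!__sync_bool_compare_and_swap(&lifo->head, item, item->next));
    return item;
#else
    sketch_item_t *item = lifo->head;
    if (NULL != item) {
        lifo->head = item->next;
    }
    return item;
#endif
}
```

With the macro at 0, nothing in this file touches an atomic primitive, which is the property that would let a non-threaded build proceed on platforms without assembly support.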


  george.





[OMPI devel] lsf support / farm use models

2007-07-14 Thread Matthew Moskewicz

hi everyone,

firstly, i'm new around here, and somewhat clueless when it comes to the
details of working with a big autoconfiscated project like
open-rte/open-mpi at the svn checkout level ...

i've read some of the archives that turned up in searches for terms like
'LSF', and it would seem there was some discussion about adding some form of
LSF support to open-rte, but that the discussion ended a while back. so,
after  playing around with the 1.2.3 release tarball for a while, and
reading  various pieces of the code until i  had a (vague) idea of the
top-level control flow and such, i decided i was ready to try to add ras and
pls components to support LSF. once i had the build system up, i tried to
create an ras/lsf directory, and slightly to my surprise, it already
existed. i was kinda hoping for that, but it appears to be *very* fresh code
at the moment. nonetheless, i played around a bit more, and ran into two
issues:

1) it appears that you (jeff, i guess ;) are using new LSF 7.0 API features.
i'm working to support customers in the EDA space, and it's not clear
if/when they will migrate to 7.0 -- not to mention that our company
(cadence) doesn't appear to have LSF 7.0 yet. i'm still looking into the
details, but it appears that (from the Platform docs) lsb_getalloc is
probably just a thin wrapper around the LSB_MCPU_HOSTS (spelling?)
environment variable. so that could be worked around fairly easily. i dunno
about lsb_launch -- it seems equivalent to a set of ls_rtask() calls (one
per process). however, i have heard that there can be significant subtleties
with the semantics of these functions, in terms of compatibility across
differently configured LSF-controlled farms, specifically with regard to
administrators' ability to track and control job execution. personally, i
don't see how it's really possible for LSF to prevent 'bad' users from
spamming out jobs or short-cutting queues, but perhaps some of the methods
they attempt to use can complicate things for a library like open-rte.

2) this brings us to point 2 -- upon talking to the author(s) of cadence's
internal open-rte-like library, several key issues were raised. mainly,
customers want their applications to be 'farm-friendly' in several key ways.
firstly, they do not want any persistent daemons running outside of a given
job -- this requirement seems met by the current open-mpi default behavior,
at least as far as i can tell. secondly, they prefer (strongly) that
applications acquire resources incrementally, and perform work with whatever
nodes are currently available, rather than forcing a large up-front node
allocation. fault tolerance is nice too, although it's unclear to me if it's
really practically needed. in any case, many of our applications can
structure their computation to use resources in just such a way, generally
by dividing the work into independent, restartable pieces (i.e. they are
embarrassingly ||). also, MPI communication + MPI-2 process creation seems
to be a reasonable interface for handling communication and dynamic process
creation on the application side. however, it's not clear that open-rte
supports the needed dynamic resource acquisition model in any of the ras/pls
components i looked at. in fact, other than just folding everything in the
pls component, it's not clear that the entire flow via the rmgr really
supports it very well. specifically for LSF, the use model is that the
initial job either is created with bsub/lsb_submit(),  (or automatically
submits itself as step zero perhaps) to run initially on N machines. N
should be 'small' (1-16) -- perhaps only 1 for simplicity. then, as the
application runs, it will continue to consume more resources as limited by
the farm status, the user selection, and the max # of processes that the job
can usefully support (generally 'large' -- 100-1000 cpus).

so, i figure it's up to me to implement this stuff ;) ... clearly, i want to
keep the 'normal' style ras/pls for LSF working, but somehow add the dynamic
behavior as an option. my initial thought was to (in the dynamic case)
basically ignore/fudge the ras/rmaps(/pls?) stages and simply use
bsub/lsb_submit() in pls to launch new daemons as needed/requested.  again,
though it's not clear that the current control flow supports this well.
given that there may be a large (10sec - 15min) delay between lsb_submit()
and job launch, it may be necessary to both acquire minimum size blocks of
new daemons at a time, and to have some non-blocking way to perform
spawning. for example, in the current code, the MPI-2 spawn is blocking
because it needs to return a communicator to the spawned process. however,
this is not really necessary for the application to continue -- it can
continue with other work until the new worker is up and running. perhaps
some form of multi-threading could help with this, but it's not totally
clear. i think i would prefer some lower-level open-rte calls that perform
daemon pre-allocation (i.e. dynamic ras/daemon 

Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Brian Barrett
the availability of functionality is set by the header files for each  
platform, not by configure.  So we'd have to play some games to get  
at the information, but it should be possible.


Brian





Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#433142: openmpi: FTBFS on GNU/kFreeBSD

2007-07-14 Thread Dirk Eddelbuettel

Petr,

On 14 July 2007 at 22:26, Petr Salinger wrote:
| Package: openmpi
| Severity: important
| Version: 1.2.3-1
| Tags: patch
| User: glibc-bsd-de...@lists.alioth.debian.org
| Usertags: kfreebsd
| 
| 
| Hi,
| 
| the current version fails to build on GNU/kFreeBSD.
| 
| It needs small fixups for munmap hackery and stacktrace.
| It also needs to exclude linux specific build-depends.
| Please find attached patch with that.

Thanks for that patch.

| It would be nice if you can ask upstream
| to include changes to opal/util/stacktrace.c and
| opal/mca/memory/ptmalloc2/opal_ptmalloc2_munmap.c .

Doing so now for their consideration.

Regards, Dirk


| 
| Thanks in advance
| 
|  Petr
| diff -u openmpi-1.2.3/debian/control openmpi-1.2.3/debian/control
| --- openmpi-1.2.3/debian/control
| +++ openmpi-1.2.3/debian/control
| @@ -3,7 +3,7 @@
|  Priority: optional
|  Maintainer: Debian OpenMPI Maintainers 

|  Uploaders: Dirk Eddelbuettel 
| -Build-Depends: debhelper (>= 5.0.0), dpatch, libibverbs-dev, gfortran, libsysfs-dev, automake, gcc (>= 4:4.1.2)
| +Build-Depends: debhelper (>= 5.0.0), dpatch, libibverbs-dev [!kfreebsd-i386 !kfreebsd-amd64 !hurd-i386], gfortran, libsysfs-dev [!kfreebsd-i386 !kfreebsd-amd64 !hurd-i386], automake, gcc (>= 4:4.1.2)
|  Standards-Version: 3.7.2
|  XS-Vcs-Svn: svn://svn.debian.org/svn/pkg-openmpi/openmpi/trunk/
|  XS-Vcs-Browser: http://svn.debian.org/wsvn/pkg-openmpi/openmpi/trunk/
| only in patch2:
| unchanged:
| --- openmpi-1.2.3.orig/opal/mca/memory/ptmalloc2/opal_ptmalloc2_munmap.c
| +++ openmpi-1.2.3/opal/mca/memory/ptmalloc2/opal_ptmalloc2_munmap.c
| @@ -26,7 +26,8 @@
|  #elif defined(HAVE_SYSCALL)
|  #include 
|  #include 
| -#elif defined(HAVE_DLSYM)
| +#endif
| +#if defined(HAVE_DLSYM)
|  #ifndef __USE_GNU
|  #define __USE_GNU
|  #endif
| @@ -59,7 +60,7 @@
|  int
|  opal_mem_free_ptmalloc2_munmap(void *start, size_t length, int from_alloc)
|  {
| -#if !defined(HAVE___MUNMAP) && !defined(HAVE_SYSCALL) && defined(HAVE_DLSYM)
| +#if !defined(HAVE___MUNMAP) && !(defined(HAVE_SYSCALL) && defined(__NR_munmap)) && defined(HAVE_DLSYM)
|  static int (*realmunmap)(void*, size_t);
|  #endif
|  
| @@ -67,7 +68,7 @@
|  
|  #if defined(HAVE___MUNMAP)
|  return __munmap(start, length);
| -#elif defined(HAVE_SYSCALL)
| +#elif defined(HAVE_SYSCALL) && defined(__NR_munmap)
|  return syscall(__NR_munmap, start, length);
|  #elif defined(HAVE_DLSYM)
|  if (NULL == realmunmap) {
| only in patch2:
| unchanged:
| --- openmpi-1.2.3.orig/opal/util/stacktrace.c
| +++ openmpi-1.2.3/opal/util/stacktrace.c
| @@ -145,8 +145,12 @@
|  case FPE_FLTDIV: si_code_str = "Floating point divide-by-zero"; break;
|  case FPE_FLTOVF: si_code_str = "Floating point overflow"; break;
|  case FPE_FLTUND: si_code_str = "Floating point underflow"; break;
| +#ifdef FPE_FLTRES
|  case FPE_FLTRES: si_code_str = "Floating point inexact result"; break;
| +#endif
| +#ifdef FPE_FLTINV
|  case FPE_FLTINV: si_code_str = "Invalid floating point operation"; break;
| +#endif
|  #ifdef FPE_FLTSUB
|  case FPE_FLTSUB: si_code_str = "Subscript out of range"; break;
|  #endif
| ___
| Pkg-openmpi-maintainers mailing list
| pkg-openmpi-maintain...@lists.alioth.debian.org
| http://lists.alioth.debian.org/mailman/listinfo/pkg-openmpi-maintainers



Re: [OMPI devel] lsf support / farm use models

2007-07-14 Thread Ralph Castain
Welcome! Yes, Jeff and I have been working on the LSF support based on 7.0
features in collab with the folks at Platform.

Some further comments below...

Ralph


On 7/14/07 2:02 PM, "Matthew Moskewicz" wrote:

> hi everyone, 
> 
> firstly, i'm new around here, and somewhat clueless when it comes to the
> details of working with a big autoconfiscated project like open-rte/open-mpi
> at the svn checkout level ...
> 
> i've read some of the archives that turned up in searches for terms like
> 'LSF', and it would seem there was some discussion about adding some form of
> LSF support to open-rte, but that the discussion ended a while back. so, after
> playing around with the 1.2.3 release tarball for a while, and reading
> various pieces of the code until i  had a (vague) idea of the top-level
> control flow and such, i decided i was ready to try to add ras and pls
> components to support LSF. once i had the build system up, i tried to create an
> ras/lsf directory, and slightly to my surprise, it already existed. i was
> kinda hoping for that, but it appears to be *very* fresh code at the moment.
> nonetheless, i played around a bit more, and ran into two issues:
> 
> 1) it appears that you (jeff, i guess ;) are using new LSF 7.0 API features.
> i'm working to support customers in the EDA space, and it's not clear if/when
> they will migrate to 7.0 -- not to mention that our company (cadence) doesn't
> appear to have LSF 7.0 yet. i'm still looking into the details, but it
> appears that (from the Platform docs) lsb_getalloc is probably just a thin
> wrapper around the LSB_MCPU_HOSTS (spelling?) environment variable. so that
> could be worked around fairly easily. i dunno about lsb_launch -- it seems
> equivalent to a set of ls_rtask() calls (one per process). however, i have
> heard that there can be significant subtleties with the semantics of these
> functions, in terms of compatibility across differently configured
> LSF-controlled farms, specifically with regard to administrators' ability to
> track and control job execution. personally, i don't see how it's really
> possible for LSF to prevent 'bad' users from spamming out jobs or
> short-cutting queues, but perhaps some of the methods they attempt to use can
> complicate things for a library like open-rte.

After lengthy discussions with Platform, it was deemed the best path forward
is to use the lsb_getalloc interface. While it currently reads the environment
variable, they indicated a potential change to read a file instead for
scalability. Rather than chasing any changes, we all agreed that using
lsb_getalloc would remain the "stable" interface - so that is what we used.

Similar reasons for using lsb_launch. I would really advise against making
any changes away from that support. Instead, we could take a lesson from our
bproc support and simply (a) detect if we are on a pre-7.0 release, and then
(b) build our own internal wrapper that provides back-support. See the bproc
pls component for examples.
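
As a rough illustration of what such a back-support wrapper might do on a
pre-7.0 system, here is a minimal sketch that parses the environment variable
directly. It assumes LSB_MCPU_HOSTS holds whitespace-separated
"hostname slot-count" pairs (e.g. "hostA 4 hostB 2"); the names host_slots_t
and parse_mcpu_hosts are hypothetical and are not part of LSF or OpenRTE.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_HOSTNAME 256

/* Hypothetical record: one host and the number of slots granted on it. */
typedef struct {
    char name[MAX_HOSTNAME];
    int  slots;
} host_slots_t;

/*
 * Parse "host count host count ..." (the assumed LSB_MCPU_HOSTS layout)
 * into out[0..max-1].  Returns the number of hosts parsed, or -1 if a
 * host name is not followed by a positive integer count.  Input beyond
 * max entries is ignored.
 */
int parse_mcpu_hosts(const char *s, host_slots_t *out, int max)
{
    int n = 0;

    assert(s != NULL && out != NULL);

    while (n < max) {
        char name[MAX_HOSTNAME];
        int slots = 0, used = 0;

        /* %255s reads one token; %n records how many chars were consumed. */
        int got = sscanf(s, "%255s %d%n", name, &slots, &used);

        if (got == EOF)
            break;                      /* nothing but whitespace left */
        if (got != 2 || slots <= 0)
            return -1;                  /* host without a valid count */

        strcpy(out[n].name, name);      /* fits: name is at most 255 chars */
        out[n].slots = slots;
        n++;
        s += used;
    }
    return n;
}
```

A component would call this on the result of getenv("LSB_MCPU_HOSTS") only
when lsb_getalloc is unavailable, so the 7.0 interface stays the primary path.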


> 
> 2) this brings us to point 2 -- upon talking to the author(s) of cadence's
> internal open-rte-like library, several key issues were raised. mainly,
> customers want their applications to be 'farm-friendly' in several key ways.
> firstly, they do not want any persistent daemons running outside of a given
> job -- this requirement seems met by the current open-mpi default behavior, at
> least as far as i can tell. secondly, they prefer (strongly) that applications
> acquire resources incrementally, and perform work with whatever nodes are
> currently available, rather than forcing a large up-front node allocation.
> fault tolerance is nice too, although it's unclear to me if it's really
> practically needed. in any case, many of our applications can structure their
> computation to use resources in just such a way, generally by dividing the
> work into independent, restartable pieces ( i.e. they are embarrassingly ||).
> also, MPI communication + MPI-2 process creation seems to be a reasonable
> interface for handling communication and dynamic process creation on the
> application side. however, it's not clear that open-rte supports the needed
> dynamic resource acquisition model in any of the ras/pls components i looked
> at. in fact, other than just folding everything into the pls component, it's not
> clear that the entire flow via the rmgr really supports it very well.
> specifically for LSF, the use model is that the initial job either is created
> with bsub/lsb_submit(),  (or automatically submits itself as step zero
> perhaps) to run initially on N machines. N should be 'small' (1-16) -- perhaps
> only 1 for simplicity. then, as the application runs, it will continue to
> consume more resources as limited by the farm status, the user selection, and
> the max # of processes that the job can usefully support (generally 'large' --
> 100-1000 cpus). 

OpenRTE will be undergoing some changes shortly, so I would strongly
recommend you avoid making major chang

Re: [OMPI devel] Build failures of 1.2.3 on Debian hppa, mips, mipsel, s390, m68k

2007-07-14 Thread Paul H. Hargrove
Brian Barrett wrote:
> On Jul 14, 2007, at 8:26 AM, Dirk Eddelbuettel wrote:
> 
>> Please let us (ie Debian's openmpi maintainers) know how else we can help.
>> I am ccing the porters lists (for hppa, m68k, mips) too to invite them to
>> help. I hope that doesn't get the spam filters going...  I may contact the
>> 'arm' porters once we have a failure; s390 and sparc activity are not as
>> big these days.
> 
> Open MPI uses some assembly for things like atomic locks, atomic  
> compare and swap, memory barriers, and the like.  We currently have  
> support for:
> 
>   * x86 (32 bit)
>   * x86_64 / amd64 (32 or 64 bit)
>   * UltraSparc (v8plus and v9* targets)
>   * IA64
>   * PowerPC (32 or 64 bit)
> 
> We also have code for:
> 
>   * Alpha
>   * MIPS (32 bit NEW ABI & 64 bit)
> 
> This support hasn't been well tested in a while, and it sounds like it  
> doesn't work for MIPS.  At one time, we supported the sparc v8  
> target, but that The other platforms (hppa, mipsel (how is this  
> different than MIPS?), s390, m68k) aren't at all supported by Open  
> MPI.  If you can get the real error messages, I can help on the MIPS  
> issue, although it'll have to be a low priority.
> 

As maintainer of the atomics code for two projects unrelated to OpenMPI,
I thought I'd pass on some of my insight.  I'll not post any code here
to avoid any accidental license questions.

HPPA lacks an atomic compare-and-swap and is therefore probably a lost
cause.  The Linux kernel uses HPPA's only atomic instruction,
load-and-clear, to implement a spinlock and a hashed table of spinlocks
to implement atomic operations.  This works because the atomic_read and
atomic_set macros honor the spinlocks.  This is not the case with ompi's
atomics, is it?  OpenMPI appears to contain fragments of such an
array-of-spinlocks implementation for SPARCv8, but Brian's comments
suggest to me that this may no longer work.
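
Purely to illustrate the idea (written from scratch in portable C11, not taken
from the Linux kernel or from either project I maintain), a hashed table of
spinlocks emulating compare-and-set might look like the sketch below. The names
emu_cmpset_32, emu_set_32 and emu_read_32 are hypothetical, and
atomic_exchange stands in for HPPA's load-and-clear, which is an assumption
made only to keep the example portable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NLOCKS 16
static_assert((NLOCKS & (NLOCKS - 1)) == 0, "NLOCKS must be a power of two");

/* One spinlock per bucket: 0 = free, 1 = held (zero-initialized). */
static atomic_int locks[NLOCKS];

/* Hash an address to its lock: drop the low bits, mask into the table. */
static atomic_int *lock_for(const volatile void *addr)
{
    uintptr_t h = (uintptr_t)addr >> 4;
    return &locks[h & (NLOCKS - 1)];
}

static void lock_acquire(atomic_int *l)
{
    /* Test-and-set loop; a real HPPA port would use load-and-clear here. */
    while (atomic_exchange_explicit(l, 1, memory_order_acquire) != 0)
        ;
}

static void lock_release(atomic_int *l)
{
    atomic_store_explicit(l, 0, memory_order_release);
}

/* Returns 1 if *addr equaled oldval and was replaced by newval, else 0. */
int emu_cmpset_32(volatile int32_t *addr, int32_t oldval, int32_t newval)
{
    atomic_int *l = lock_for(addr);
    int ok;

    lock_acquire(l);
    ok = (*addr == oldval);
    if (ok)
        *addr = newval;
    lock_release(l);
    return ok;
}

/* Stores must take the same lock, or they can race with emu_cmpset_32. */
void emu_set_32(volatile int32_t *addr, int32_t val)
{
    atomic_int *l = lock_for(addr);

    lock_acquire(l);
    *addr = val;
    lock_release(l);
}

/* Reads take the lock too, so they never observe a half-done update. */
int32_t emu_read_32(volatile int32_t *addr)
{
    atomic_int *l = lock_for(addr);
    int32_t v;

    lock_acquire(l);
    v = *addr;
    lock_release(l);
    return v;
}
```

The scheme only holds together if every access to a shared word goes through
these functions; a single plain store that bypasses the lock breaks it, which
is the concern raised above about ompi's atomics.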

ARM before v6 needs no memory barriers, but lacks atomic instructions
other than unconditional swap (though very few multi-processor systems
were built with earlier chips).  However, a post on the libc-ports mailing
list (http://sourceware.org/ml/libc-ports/2005-10/msg00016.html) says of
the code used in glibc:

/* Atomic compare and exchange.  These sequences are not actually Atomic;
   there is a race if *MEM != OLDVAL and we are preempted between the two
   swaps.  However, they are very close to atomic, and are the best that a
   pre-ARMv6 implementation can do without operating system support.
   LinuxThreads has been using these sequences for many years.  */

So, ompi might try getting away with the same logic if an ARM port is
high priority for somebody.  Alternatively, if one is on a new enough
Linux kernel (>= 2.6.12 IIRC) you get kernel support for CAS by calling
a function in a "highpage" (like the VDSO on x86) that is implemented
natively on >=ARMv6 and traps to the kernel otherwise (the kernel
disables interrupts and then uses the not-quite-atomic sequence).

For ARMv6 you get a load-exclusive and store-exclusive pair, and you get
real memory barriers as well.

M68K has a CAS instruction and memory barriers are no-ops.  This should
be an easy one to implement from the instruction set reference docs.

s390 is one I don't have any first-hand experience with but know from
peeking at the Linux kernel source that it has the eieio memory-barrier
instruction of early PPCs and a CAS instruction.  Again, should be easy
from the ISA docs.

MIPS is supposed to work w/ ompi on IRIX, but there is no
atomic-mips-linux.s on OpenMPI 1.2.3.  I was going to try to build 1.2.3
on an O2K (IRIX64 6.5 and gcc 3.3) today, but found that configure dies with
  configure: error: Could not determine global symbol label prefix
So, I'll not be pursuing that.


-Paul

> We don't currently have support for a non-assembly code path.  We  
> originally planned on having one, but the team went away from that  
> route over time and there's no way to build Open MPI without assembly  
> support right now.
> 
> 
> Brian
> 


-- 
Paul H. Hargrove  phhargr...@lbl.gov
Future Technologies Group
HPC Research Department   Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory Fax: +1-510-486-6900